[57] ABSTRACT

Method and apparatus to enable scanning one or more
documents, automatically identifying significant key topics,
concepts, and phrases in the documents, and creating summary
pages for, and hyperlinks between, some or all of these
key topics. Optionally, documents are divided into
segments, in order that only the needed segment of a
hyperlinked-to document need be transferred to a viewer's
display. A process running on a computer can be used which
(a) allows an author to select source documents and then,
using a semantic analyzer program running on a computer,
(b) automatically identifies significant key topics within the
selected documents, (c) compiles those key topics into
summary pages, (d) generates presentation pages, optionally
segmenting the selected documents into smaller pieces, and
(e) embeds hyperlinks from these summary pages to the
locations where key topics appear in the presentation pages.
Different types of summary page are available, including
abstract, concept, phrase, and table-of-contents summary
pages. A summary page provides an index into the source
document, and can be appended to the source document. A
method of using a computer to hyperlink through
automatically generated hyperlinks, and a data structure
which can be used to support such hyperlinking are
described.

AUTOMATIC SUMMARY PAGE CREATION
AND HYPERLINK GENERATION

FIELD OF THE INVENTION

The present invention relates to methods and apparatus
for automatically analyzing and modifying documents, and
more specifically to automatically extracting key topics,
such as concepts or phrases, from documents and generating
summary pages containing key topic lists and hyperlinks to
the extracted key topics, and even more specifically to
automatically generating hyperlink indexes for documents
stored on CD-ROM or available over networks such as the
Internet.

BACKGROUND OF THE INVENTION 

Document authors often provide the ability for a reader to
efficiently find and retrieve more information about any
particular item in a document. One method to provide such
an ability is to give the reader a reference. An example of
such a reference is a footnote or citation in a scholarly article
or book. A reader can use the footnote to identify another
book or article, and even the page number in the book or
article, in order to obtain more detailed information.
Similarly, an index entry in a book's index points to the
places in the book where more information regarding the
term in the index entry can be found. In the prior art, the
author or editor of the information must manually find the
key topics which might be of interest to readers, and then
generate the footnote or index entry which points the reader
from the footnote or index entry to the point where the topic
is more fully explained.

A 'hyperlink' is defined as a point-and-click mechanism
implemented on a computer which allows a viewer to link
(or jump) from one screen display where a topic is referred
to (called the 'hyperlink source'), to other screen displays
where more information about that topic exists (called the
'hyperlink destination'). These hyperlinked screen displays
can all be of portions of the media data (media data can
include, e.g., text, graphics, audio, video, etc.) from a single
data file, or can be portions of a plurality of different data
files; these can be stored in a single location, or at a plurality
of separate locations. The hyperlink is the combination of a
display element or a (generally visual) indication that a
hyperlink is available for a particular hyperlink source, and
a computer program which finds and displays the hyperlink
destination. A hyperlink thus provides a computer-assisted
way for a human user to efficiently jump between various
locations containing information which is somehow related.

A 'hypermedia application' is defined as a computer
application which contains media data and hyperlinks
between hyperlink sources and destinations in the media
data.

The people who provide the content (e.g., text and
pictures), edit that information, and who define the
hyperlinks are called 'authors.' People who use the finished
application are called 'viewers.' People who have computers
which transmit information for others to view are called
'database providers.'

Prior-art FIG. 1 is a conceptual drawing of a hyperlink. A
hyperlink is a link between hyperlink source 72, which is
located in a first data file, and hyperlink destination 74,
which can be located in the same or in a second data file.
Hyperlink source 72 and hyperlink destination 74 are
typically displayed on computer screen 52 at different points
in time. Three elements that comprise a hyperlink are:

(a) hyperlink source 72, which specifies a key topic to be
displayed in a hot area. A 'hot area' is a portion of the
display screen that, if pointed at and clicked on, will
cause the computer to execute computer code such as
a hyperlink program 79 which hyperlinks (i.e., causes
a branch) to a hyperlink destination 74. (Typically, the
hot area is visually indicated by highlighting, such as
color, a bold font, blinking or underlining, but it may
contain an icon, picture graphic, or other visual
indication.)

(b) hyperlink destination 74, which includes information
(e.g., destination location specification 73) specifying
the location of the text or picture that will be displayed
if the hyperlink is taken. Destination location
specification 73 for hyperlink destination 74 is generally
stored in the data file containing hyperlink source 72.
Hyperlink destination 74 itself can be either in the same
or a different data file as hyperlink source 72.

(c) hyperlink computer code 79 that, in response to a
'viewer action', causes hyperlink destination 74 to be
displayed in the context in which it appears. Typically,
that 'viewer action' comprises a viewer clicking on the
hyperlink source 72. 'Clicking' is defined as pointing
with and activating pointing device 54 at a hot area,
such as hyperlink source 72. A pointing device can
include a mouse, joystick, or other device that is used
to select a location on a computer screen and is
activated by, for example, depressing a switch such as a
mouse button 59, or otherwise indicating that the
computer should execute hyperlink code 79. Upon
activation, hyperlink code 79 uses destination location
specification 73 to locate hyperlink destination 74, and
to display that information.
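
By way of illustration only, the three elements above can be
pictured as a small data structure. The following Python
sketch is not taken from the specification: the class names,
the `files` dictionary, and the `follow` function are
hypothetical stand-ins for hyperlink source 72, destination
location specification 73, and hyperlink code 79.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DestinationSpec:
    """Stands in for destination location specification 73 (hypothetical layout)."""
    file_name: str    # data file holding the hyperlink destination
    anchor_name: str  # named location inside that file


@dataclass(frozen=True)
class HyperlinkSource:
    """Stands in for hyperlink source 72: a hot area plus where it leads."""
    key_topic: str                # text shown in the hot area
    destination: DestinationSpec  # stored alongside the source


def follow(source: HyperlinkSource, files: dict[str, dict[str, str]]) -> str:
    """Plays the role of hyperlink code 79: locate and return the destination text."""
    spec = source.destination
    return files[spec.file_name][spec.anchor_name]


# Two toy "data files", each mapping anchor names to the text found there.
files = {
    "intro.txt": {"top": "Overview of hyperlinks."},
    "detail.txt": {"hot-area": "A hot area is a clickable screen region."},
}
source = HyperlinkSource("hot area", DestinationSpec("detail.txt", "hot-area"))
print(follow(source, files))
```

In this sketch the destination specification is stored with
the source, mirroring the statement that specification 73 is
generally kept in the data file containing hyperlink source 72.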

Apple Computer's HyperCard™ program provided an
early and widely known program that supports the
development of hypermedia applications and hyperlinking. A
HyperCard author specifies the hyperlink source 72,
including a simple program that is activated when the hot area
for hyperlink source 72 is clicked on and that hyperlinks to a
hyperlink destination. The process of identifying topics and
generating and embedding hyperlinks in the text is a manual,
labor-intensive process.

The Internet, and particularly the World Wide Web
protocol, has brought hyperlinking to use over networks. A
network is a collection of computers connected by
communication lines. The Internet is an international network
comprised of many heterogeneous sub-networks which link
thousands of computers which have millions of users, many
of whom are authors. The World Wide Web protocol
(sometimes simply called "the Web") is an interface and
communications protocol which is sometimes used on the
Internet to make use of the Internet easier.

Prior-art FIG. 2A shows a simplified schematic of a
network 400, such as the Internet, with four computers 411,
412, 413, and 414 connected as nodes on the network. In the
embodiment shown in FIG. 2A, the nodes are viewer's
computer 411, author's computer 412, database provider's
computer 413 which presents a Web application to others,
such as the human viewer at viewer's computer 411, to use.
A single user on multi-use computer 414 may be a viewer,
an author and a database provider at various times. There are
not necessarily any physical differences among computers
411, 412, 413, and 414; they are simply generic computers
put to different uses.

Prior-art FIG. 2B shows a simple connection between
viewer's computer 411 and CD-ROM drive 723.
'Downloading' is defined as the transfer of data across a
network, typically from database provider's computer 413 to
viewer's computer 411. 'Downloading' can also refer to
transmitting a document from a CD-ROM 723 to viewer's
computer 411 as is shown in FIG. 2B. Alternatively, a
CD-ROM 723 could connect to database provider's computer
413 or multi-use computer 414 to provide CD-ROM access to
other network computers.

The term 'document' is defined in a broad sense as text
and other information stored in one or more computer files.
Documents include everything from simple short text
documents to large computer multi-media databases. Examples
of database provider computers 413 containing these
documents include computers in the patent office and the
Library of Congress, organizations which have huge volumes
of information, much of it already computerized and more in
the process of being computerized.

As the many different online databases of documents such
as legal libraries become available on the Web, a mechanism
to automatically generate hyperlinks to places in these
databases in order to facilitate viewing across the Internet is
needed. Indeed, a major and widely recognized problem
with the Internet is that, while the Internet has a wealth of
information, most users find it difficult to access the
information [...]

One way of organizing information on the Internet [...]
has been to provide users with an overview interface, called
a 'home page', to the information. Although a home page is
often [...] a visually [...] trademark [...] contains a key
topic summary [...] by one author or database provider [...]
take a viewer to the information the viewer has chosen.

At about the same time as the capabilities of the Internet
have grown, CD-ROM (Compact Disk Read-Only Memory)
drives have become important peripherals. CD-ROMs today
typically contain up to about 680 megabytes of information,
and generally contain many of the same kinds of documents
that are accessible via the [...] or need to be, indexed. A home
page [...] hyperlinks into the information contained on a
CD-ROM is [...]

Another technology which is relevant to the present
invention is the automatic semantic analysis of text to
identify and extract key topics from text [...] accomplishing
this semantic [...] a syntactic analysis [...] document to
determine how each word is being used (since some words
having identical spelling have quite different meanings) and
then uses a 'lexicon dictionary' (also called a 'lexicon')
which specifies semantic weights assigned to the words in
the text reflecting their value as index entries. A computer
program can use the synthesized values, or semantic weights,
for words to qualify phrases as key topics. A user is able to
specify a threshold value so that the computer program could
select only those phrases greater than, or equal to, that
specified threshold value as key topics. Known
semantic-analysis computer programs do not generate
hyperlinks.

A significant problem with generating information for
computer-based hyperlink systems is that the author must
review the material to be hyperlinked, must identify key
topics to which to hyperlink, and must set up the hyperlinks.
This is a time-consuming and labor-intensive process. What
is needed, and what the present invention provides, is a
method that automatically [...] and phrases in a document's
text, inserts identifying tokens for hyperlinks to those key
topics, generates one or more summary pages having key
topic lists, and automatically generates hyperlinks from the
summary pages to the key topics in the document's text. In
particular what is needed is an apparatus and method for
automatically identifying semantically important key topics.
What is also needed is a system and method for automatically
generating home pages containing various types of index
information and the associated hyperlinks to other
information located on the Internet and the Web.

SUMMARY OF THE INVENTION

The present invention scans one or more documents,
automatically identifies significant key topics, concepts, and
phrases in the documents, and creates summary pages for,
and hyperlinks between, these key topics. Where the same
[...] documents, one embodiment of the present invention
creates hyperlinks between all of the instances of that key
topic. The present invention also provides for segmenting of
documents, in order that only the needed segment of a
hyperlinked-to document need be transferred to a viewer's
display.

One embodiment of the present invention includes a
process running on a computer which (a) allows an author to
select source documents and then, using a semantic analyzer
program running on a computer, (b) automatically identifies
significant key topics within the selected documents, (c)
compiles those key topics into summary pages, (d) generates
presentation pages and optionally segments the selected
documents into smaller pieces, and (e) embeds hyperlinks
from these summary pages to the locations where key topics
appear in the presentation pages. In [...] creates summary
pages containing various abstractions of [...] selected
documents, and [...] the documents. Summary pages are
pages [...] a web browser program [...] lists of key topics
[...] hyperlinks to [...] the selected documents where the key
topics [...] Different types of summary page are available,
including abstract, concept, phrase, and table-of-contents
summary pages. A method of using a computer to hyperlink
through automatically generated hyperlinks and a data
structure which can be used to support such hyperlinking are
described. [...] provides an index to the source document
which [...] appended to the source document to provide [...]
viewer [...] word-processor program or a web-browser
program.

DESCRIPTION OF THE DRAWINGS

In the following detailed description of the invention,
reference is made to the accompanying drawings which
form a part hereof, and in which is shown by way of
illustration only, specific exemplary embodiments in which
the invention may be practiced. It is to be understood that
other embodiments may be utilized, and structural changes
may be made, without departing from the scope of the
present invention.

FIG. 1 is a conceptual drawing of a single prior-art
hyperlink.

FIG. 2A shows a prior-art network connected to a plurality
of computers.

FIG. 2B shows a prior-art CD-ROM drive connected to a
computer.

FIG. 3 shows the flow from source document 20 through
summary page generator 40 to resultant documents.

FIG. 4 shows a conceptual drawing of vertical
hyperlinking.

FIG. 5 including FIGS. 5A-5B shows a conceptual
drawing of circular hyperlinking.

FIG. 6 including FIGS. 6A-6C shows a conceptual
drawing of horizontal hyperlinking.

FIG. 7 shows the opening screen of one embodiment of
the present invention.

FIG. 8 shows a conceptual drawing of the entry page,
summary pages, and presentation pages generated by the
present invention along with hyperlinks between them.

FIG. 9A shows an example IFF data structure for a word.

FIG. 9B shows an example IPF data structure for a
paragraph.

FIG. 10 shows the hyperlinking for 26 summary pages,
one for each letter.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENT

In the following detailed description of the preferred
embodiments, reference is made to the accompanying
drawings which form a part hereof, and in which are shown, by
way of illustration, specific embodiments in which the
invention may be practiced. It is to be understood that other
embodiments may be utilized and structural changes may be
made without departing from the scope of the present
invention.

An 'anchor' is defined as a word, phrase, or graphic (for
example, one that might likely be used to locate information
of interest) which is 'anchored' to its location within the
context of the file data, as opposed to being fixed to a
specific numerical address within the file. The source and
destination ends of hyperlinks for the present invention are
coupled to anchors so they are anchored to a specific portion
of text or to a specific icon or picture displayed on a
computer screen, rather than being associated with a specific
address in a file. Thus, the anchor remains with the same
piece of data when information is inserted in or deleted from
the file, whereas the specific address of that piece of data
may change.
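
The difference between an anchor and a fixed numerical
address can be shown with a short Python sketch. It is offered
as an illustration under stated assumptions only: the anchor
is modelled simply as the anchored phrase itself, and the
function names are hypothetical.

```python
def find_by_offset(text: str, offset: int, length: int) -> str:
    """Fixed numerical address: read `length` characters starting at `offset`."""
    return text[offset:offset + length]


def find_by_anchor(text: str, anchor: str) -> int:
    """Anchor lookup: search for the anchored phrase and return its current offset."""
    return text.find(anchor)


original = "Summary pages list key topics."
phrase = "key topics"
offset = original.find(phrase)   # address recorded before the file is edited

edited = "NOTE: " + original     # information inserted ahead of the phrase

stale = find_by_offset(edited, offset, len(phrase))  # old address, wrong data
current = find_by_anchor(edited, phrase)             # anchor follows the data

print(repr(stale))
print(edited[current:current + len(phrase)])
```

After the insertion, the recorded offset no longer points at
the phrase, while the anchor lookup still finds it.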

The American Heritage dictionary begins its definition of
'index' as "something that serves to guide, point out, or
otherwise facilitate reference . . ." The term 'index entry' is
defined to include a term or phrase, with information as to
the location where more information regarding that index
entry can be found. The term 'index' is defined as a grouping
or listing of index entries. An index is often ordered in some
manner, for example by alphabetization. Hypermedia
applications usually include text, and often also include
pictures, icons, graphics, animations, sound, and video
(movies).

A 'web browser' is traditionally defined as a computer
program which supports the displaying of documents, which
include Hypertext Markup Language (HTML) formatting
markup tokens (discussed further below), and hyperlinking
to other documents, or phrases in documents, across a
network. In particular, web browsers are used to access
documents across the Internet's World Wide Web. The
discussion of the present invention defines both 'web
browser' and 'browser' to include browser programs which
enable accessing hyperlinked information over the Internet
and other networks, as well as from magnetic disk, CD-ROM,
or other memory, and does not limit web browsers to just use
over the Internet. Several Internet web browsers are
available, some of them commercially. Two of the best
known of these, Mosaic and Netscape Navigator, are
described in Internet Starter Kit by Adam Engst, Corwin
Low and Michael Simon, Second Edition, Hayden Books,
1995. Any viewer of the World Wide Web will typically use
a web browser. Indeed, a viewer viewing documents created
by the present invention normally uses a web browser to
access the documents that a database provider may make
available on the network. Web browsers allow clicking on
hot areas (generated by source anchors containing a
document reference name and a hyperlink to that document)
so that clicking on the hot area causes the specified document
to be downloaded over the network and displayed for the
viewer. Most web browsers also maintain a history of
previously used source anchors and display a hot area which
allows hyperlinking back to the database provider's home
page (or back through the locations the viewer has
previously "visited") so the viewer can always go back to a
familiar place.

What makes a web browser on a network such as the
Internet so powerful is that any of the documents viewed
with the program may be located (or scattered in pieces) on
any computer connected to network 400. The viewer can use
a mouse, or other pointing device, to click on a hot area,
such as highlighted text or a button, and cause the relevant
portion of the referenced document to be downloaded to the
viewer's computer 411 for viewing. These downloaded
documents in turn can contain hyperlinks to other
documents on the same or other computers. 'Downloading' is
defined as the transmitting of a document or other
information from the database provider's computer 413 over
a network 400 to the viewer's computer 411.

A 'source anchor' is an anchor which is combined with a
hyperlink source 72 and, typically, an index term 69. The
index term 69 conveys information regarding a key topic to
a viewer. The index term 69 is generally highlighted as a hot
area to indicate to a viewer that a hyperlink is available.
Alternative embodiments replace the index term 69 with an
icon graphic. A destination anchor 76 is an anchor placed
in a file at a hyperlink destination 74. A source anchor 75
typically contains the name of the destination anchor 76
stored in destination location specification 73 in order that a
web browser can find and hyperlink to the hyperlink
destination 74. Combination anchor 67 is an anchor which is
combined with a combination hyperlink 77 (which
comprises both a hyperlink source 72 and a hyperlink
destination 74). In one embodiment, combination anchor 67 is
implemented by using a source anchor 75 in close proximity
to a destination anchor 76.

Information is presented to World Wide Web viewers as
a collection of 'documents' and 'pages'. As mentioned
above, a 'document' is defined in a broad sense to indicate
text, pictorial, audio, video and other information stored in
one or more computer files. Viewing such multimedia files
can be much like watching television. Documents include
everything from simple short text documents to large
computer multi-media databases.

A 'page' is defined as any discrete file which can be
downloaded as a single download segment. Technically, a
web browser does not recognize or access documents per se,
but instead accesses pages. Typically, one page is
downloaded by a web browser as the result of clicking on a hot
area. A page often has several source anchors 75 with
hyperlinks to various other pages or to specific locations
within pages.

One problem with accessing documents over the Internet
is that many documents are quite long, and thus can take
quite some time to download over the network. This means
that viewers are often reluctant to access a document unless
they know it will be useful. The present invention facilitates
dividing documents into a plurality of pages which can be
efficiently chosen by a viewer and downloaded, one page at
a time, and only when the particular page desired is
referenced. A page is thus a document which contains a portion
of a source document. A source document is a document
from which derivative documents (such as pages) are
produced. The source document could be reconstructed from
the pages generated from the source document.
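
The division of a source document into pages, and its
reconstruction from them, can be sketched as follows. This
is an illustrative Python sketch only; splitting at blank-line
paragraph breaks and the `max_chars` size limit are
assumptions, not rules stated in the specification.

```python
def split_into_pages(source: str, max_chars: int) -> list[str]:
    """Group paragraphs into pages, starting a new page before `max_chars` is exceeded.

    A single paragraph longer than `max_chars` still becomes its own page.
    """
    pages: list[str] = []
    current: list[str] = []
    for paragraph in source.split("\n\n"):
        # Size of the page if this paragraph were added, separators included.
        size = sum(len(p) for p in current) + 2 * len(current) + len(paragraph)
        if current and size > max_chars:
            pages.append("\n\n".join(current))
            current = []
        current.append(paragraph)
    if current:
        pages.append("\n\n".join(current))
    return pages


def reconstruct(pages: list[str]) -> str:
    """Rebuild the source document by rejoining its pages in order."""
    return "\n\n".join(pages)


source_document = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
pages = split_into_pages(source_document, max_chars=40)
print(len(pages))
print(reconstruct(pages) == source_document)
```

Because pages are cut only at paragraph boundaries, rejoining
them in order yields the original text.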

A 'summary page' is defined as an overview-type page
containing summary information about another document
(or a set of documents, if desired) and one or more
hyperlinks to that other document.

A 'presentation page' is defined as a page containing a
portion or segment of a larger source document. Presentation
pages provide conveniently sized pieces of the larger source
document which are downloaded one at a time (rather than
downloading the entire source document), typically as a
result of a hyperlink the viewer wants to take into the
corresponding portion of the source document.

From the point of view of a web browser program,
presentation pages and summary pages are technically
indistinguishable. However, summary pages are normally
documents that are designed by people to contain hyperlinks to
presentation pages (or to other summary pages), and are
designed for use on the World Wide Web. In the context of
the present invention, summary pages are also used to help
navigate through information contained on a CD-ROM.

An 'entry page' is defined as a summary page that has
been assembled by a person or computer as an entry point to
hyperlink to other summary pages and presentation pages of
interest. Note, however, that any page, including summary
pages and presentation pages, can be accessed and/or
downloaded directly, without having to go through an entry page.

A 'home page' is defined as an entry page used by a
database provider to provide an overview of other pages
and/or documents available through the system associated
with the home page. A home page often contains a trademark
and other flashy pictorial or aesthetic information
identifying the database provider. The viewer normally begins by
clicking on one of the hot areas on a home page which the
World Wide Web uses as an entry page to the information a
database provider presents. The viewer likely starts to trace
through a web of hyperlinks to a series of various documents
on various computers on a network. (Hence the term World
Wide Web.)

To support the Internet and the World Wide Web, a
markup language called Hypertext Markup Language
(HTML) has been developed. HTML has two major
objectives. First, HTML provides a way to specify the structural
elements of text (e.g., this is a heading, this is body text, this
is a list, etc.) using tokens which are independent of the
content of the text. A web browser uses these tokens to
format the displayed text for the particular display device of
a particular viewer. So, for example, HTML allows an author
to specify up to six levels of heading information bracketed
by six different heading-token pairs. Applications (e.g., web
browsers) on different computers then process the HTML
documents for visual presentation in a manner customized
for particular display devices. An application on one
computer could display a level 1 heading as 14 point bold
Bodoni, while an application on another computer could
display it as 20 point italic Roman. A level 1 heading is
heralded with the sequence token <h1> and terminated with
the token </h1>. Thus, a heading might be encoded as:

<h1> This is a level 1 heading </h1>

for a level one heading or

<h6> This is a level 6 heading </h6>

for a level 6 heading. As a markup language, HTML enables
a document to be displayed within the capabilities of any
particular display system even though that display system
does not support italic, or bold, color, or any particular
typeface or size. Thus, HTML supports writing documents
so they can be output to everything from simple
monospaced, single-size fonts to proportional-spaced,
multiple-size, multiple-style fonts. Each computer program
that processes an HTML document can translate that HTML
document into a display format supported by the hardware
it will run on.

The second and more important aspect of HTML, for the
purposes of the present invention, is that it provides a
mechanism to incorporate hyperlinks within a single
document and between documents located at different nodes on
the Internet. These hyperlinks can contain addresses of
documents anyplace on the Internet. HTML is described in
The HTML Manual of Style by Larry Aronson, Ziff-Davis,
1994.
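
As an illustration of this hyperlinking mechanism, the
following Python sketch emits the two HTML tokens involved:
a source anchor (`<a href=...>`) and a named destination
anchor (`<a name=...>`, the form used by HTML of this period;
later HTML favors an `id` attribute). The function names, page
file name, and anchor names are hypothetical.

```python
from html import escape


def destination_anchor(name: str, text: str) -> str:
    """Mark `text` as a named hyperlink destination."""
    return f'<a name="{escape(name, quote=True)}">{escape(text)}</a>'


def source_anchor(page: str, name: str, label: str) -> str:
    """Build a hot area that hyperlinks to the named destination in `page`."""
    href = escape(f"{page}#{name}", quote=True)
    return f'<a href="{href}">{escape(label)}</a>'


print(source_anchor("page3.html", "semantic-weights", "semantic weights"))
print(destination_anchor("semantic-weights", "semantic weights"))
```

The source anchor carries the destination's name after the
`#`, so a browser can find the destination even if
surrounding text is later inserted or deleted.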

FIG. 3 shows the flow from source document 20 through
summary page generator 40 to resultant documents. In its
most general form, summary page generator 40 is a program
running on a computer which automatically analyzes textual
data in a source document 20, and using weighting rules
determines from the textual data what are the most
significant phrases (i.e., strings of words), and generates a
presentation page 150 which contains textual data from source
document 20 plus special codes embedded in that textual
data, the codes which specify to another program (generally
a browser) where those significant phrases are.
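
The selection of significant phrases by weighting rules can
be sketched as follows. This Python sketch is illustrative
only and does not reproduce the semantic analyzer: the
lexicon contents, the rule of summing word weights, and the
threshold value are all assumptions, the threshold echoing
the user-specified threshold mentioned in the background.

```python
import re

# Hypothetical lexicon: semantic weight per word (values invented for illustration).
LEXICON = {"hyperlink": 5, "summary": 4, "page": 2, "anchor": 4, "the": 0, "a": 0}


def phrase_weight(phrase: str, lexicon: dict[str, int]) -> int:
    """Score a phrase as the sum of its words' weights; unknown words count as 0."""
    words = re.findall(r"[a-z]+", phrase.lower())
    return sum(lexicon.get(word, 0) for word in words)


def select_key_topics(phrases: list[str], lexicon: dict[str, int], threshold: int) -> list[str]:
    """Keep phrases whose weight is greater than or equal to `threshold`."""
    return [p for p in phrases if phrase_weight(p, lexicon) >= threshold]


candidates = ["the summary page", "a page", "hyperlink anchor"]
print(select_key_topics(candidates, LEXICON, threshold=6))
```

Raising the threshold yields fewer, more significant key
topics; lowering it yields a longer index.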

Summary page generator 40 is typically a computer
program that processes one or more source documents 20 to
produce one or more output summary pages 62, and
optionally, produces entry page 78 and divides source
document 20 into a plurality of presentation pages 150. In one
embodiment, summary page generator 40 runs on an
IBM-compatible personal computer.
In one embodiment, a typical summary page 62 contains
key-topic index entries that include hyperlinks to destination
anchors where those key topics appear in the presentation
pages 150 generated from source document 20. Various
types of summary pages 62 are created; for example,
separate summary pages can be created which contain a
table-of-contents, a concept index, a phrase index, or an abstract
index, respectively. In one embodiment, summary page
generator 40 also generates an entry page 78 which contains
source anchors 75 having hyperlink sources 72 to the
various summary pages 62, and in one embodiment,
optionally to presentation pages 150. In an alternative
embodiment, summary page generator 40 combines all the
summary pages 62 on a single summary page 62. A key topic
index entry is an index term 69 for the key topic and an
associated source anchor 75 or combination anchor 67 that
are typically hyperlinked to occurrences of that key topic in
the source documents 20 or their derivative documents (i.e.,
presentation pages 150).
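
A summary page of key-topic index entries might be assembled
along the following lines. The Python sketch below is an
illustration under stated assumptions, not the output format
of summary page generator 40: the HTML list layout, the page
file names, and the anchor names are hypothetical.

```python
from html import escape


def build_summary_page(index: dict[str, list[tuple[str, str]]]) -> str:
    """Render key-topic index entries as an HTML list.

    `index` maps each key topic to (page file, destination anchor name) pairs,
    one pair per occurrence of the topic in the presentation pages.
    """
    lines = ["<ul>"]
    for topic in sorted(index):  # alphabetized, as indexes often are
        links = ", ".join(
            f'<a href="{escape(page, quote=True)}#{escape(name, quote=True)}">{n}</a>'
            for n, (page, name) in enumerate(index[topic], start=1)
        )
        lines.append(f"<li>{escape(topic)}: {links}</li>")
    lines.append("</ul>")
    return "\n".join(lines)


index = {
    "summary page": [("page1.html", "kt2-1")],
    "hyperlink": [("page1.html", "kt1-1"), ("page2.html", "kt1-2")],
}
summary = build_summary_page(index)
print(summary)
```

Each index term is followed by one source anchor per
occurrence, so the viewer can jump directly to any instance
of the key topic.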

The viewer begins navigating a document or database of
documents starting at an entry page 78, and from there,
hyperlinking to one of several summary pages, which in turn
hyperlink to presentation pages, where data from the actual
source documents are displayed for the viewer. In some
cases, such as a key-phrase summary page 100 which
contain hyperlinks to an abstract summary page 140, one
summary page 62 will hyperlink to another summary page.

A summary page 62 could fit on a single computer display
screen, or could be tens of thousands of lines of text which
are scrolled, as in a word processor. In one embodiment
presentation pages 150 are derivative versions of source
document 20 that contain embedded hyperlinks inserted by
summary page generator 40.

In another embodiment, source document 20 is its own
presentation page 150, especially if source document 20
already contains hyperlinks and/or hyperlink destinations
inserted by an author before being processed by summary
page generator 40.

There are three kinds of hyperlinks that can be generated:

Vertical hyperlinks: Vertical hyperlinks 91 are single-hop
hyperlinks as shown in FIG. 4. Each vertical hyperlink 91
hyperlinks to one instance of a key topic in the presentation
pages. As many hyperlink source anchor entries 72 for
vertical hyperlinks 91 are created in summary page 62 for a
key topic as there are instances of that key topic in the
presentation pages 150. In one embodiment, summary page
generator 40 locates each significant instance of a key topic
by using semantic analysis on source documents 20.

Circular hyperlinks: When circular hyperlinks are
generated, only one combination anchor entry 67 is created
in summary page 62 for each key topic, no matter how many
times that key topic appears in presentation pages 150. That
combination anchor entry 67 is circularly hyperlinked
through all the instances of that key topic in the presentation
pages 150. FIG. 5 shows a conceptual schematic of a circular
hyperlink starting at combination anchor entry 67 in
summary page 62 and hyperlinking through each combination
anchor 67 in presentation page 150, each of which, being a
combination anchor 67 in a circular hyperlink chain, is both
a source anchor and a destination anchor. The combination
anchor entry 67 on summary page 62 allows vertical
hyperlink 91 to the first instance of that key topic in presentation
page 150, which in turn through the key topic's combination
[...]

[...] more-detailed entry such as an abstract entry. Although
horizontal hyperlinks 95 need not be circular, the horizontal
hyperlinks 95 shown in FIG. 6 are circular hyperlinks that
hyperlink through combination anchors 67. In one
embodiment, summary page generator 40 scans all the
summary pages for key topics and inserts horizontal
hyperlinks 95 to those key topics in the summary pages.

In one embodiment, only one of horizontal hyperlinks or
vertical hyperlinks can be selected as being circular
hyperlinks. However, in another embodiment both types are
selected as circular, and two sets of circular hyperlinks could
cycle through the same summary page list entry. In yet
another embodiment a separate icon for horizontal
hyperlinks is eliminated and each circular hyperlink cycles
through both the presentation pages and the summary pages.

In one embodiment, since both horizontal hyperlinks and
vertical hyperlinks are used, a horizontal hyperlink icon
appears near the source anchor so the viewer can either click
on the highlighted key topic term in the source anchor to
hyperlink through vertical hyperlinks or click on the
horizontal hyperlink icon to hyperlink through horizontal
hyperlinks.

When circular hyperlinks are used, a combination anchor
67 such as shown in FIG. 5, is both a source anchor and a
destination anchor, so there are, in effect, three forms of
embedded hyperlinks: Source, Destination and
Combination. A source anchor 75 specifies a hot area to be
highlighted and the name of the location in a document to which
to hyperlink. A destination anchor 76 specifies the name of
a place in a document in order that a hyperlink can go to that
destination place. A combination anchor 67 contains both
source anchor and destination anchor types of information
and is described now in more detail.

A combination anchor 67 is a destination anchor that ends
one hyperlink combined with a source anchor that begins
another hyperlink.

The verb 'to hyperlink' is defined as the clicking on a
source anchor to go to a destination anchor, and includes
following down a chain of hyperlinks by continuing to click
on combined anchors.

andior 67 allows hypfflink 92 to the second instance iRiiich To generate a combination anchw 67, summary page 

afiows hyperlink 93 to a third instance of the key topic, and generate 40 embeds a hyperlink wbkh (a) identifies index 

so on, untfl the final instance of the key topic aUows torn 69 for Ae key topic in ordcx that the key topic term can 

hyperiink 94 back to the combination andior entry 67 in the be highlighted as in a hot area, (b) specifies a name for 

summaiy page 6Z 45 destination anchor 76 fOT the aariblnation anchw, hi order 

In an alternative enibodiinent, die final instance of flic key that the ccanWnation anchff can be found and hyperiinked to 

topic aUows a hyperiink back to the first instance of the key as a destination, and (c) specifies destination location speci- 

topic in the presentation page ratiier than to tiie summaiy fication 73, used to find the location hi a document to ^^h 

page. The psefeired embodiment uses hyperlinking bade to hypcrimk when the hot area is clicked on. 

through die summaiy page, since this function gives die 50 If die text m which one wanted to embed a hyperiinkis, 

viewer visual feedback every time the viewer completes a "One should identify the essence of the idea if one wants to 

hyperlink cyde. think dearly.", and the key tojac term is 'the essence'*, then 

One embodiment of die presait invention is mduded in in order to insert a combination anchor 67 conq^mg a 

the AnchoiPage™ program by the assignee of the present delation anchor 76 and a source anchor 75, sununaiy page 

mvention. 55 generator 40 diangcs die tott to be: 

Circular hyperiinks are an alternative to nmi-drcular 

hypali^ which have tte advanug. of making Ihc luvl- ^.S^SS^i^Sf^TrSI^if - 

gation to all mstances of a key topic easier and/or faster, thus ^ 
rcdudng the number of entries in a siunmaiy page by 

allowing a single entry per key tope in die summary page, 60 The above is a typical KTMLfocmat "<A...>*' defines die 

rather than one entry for eadi instance of the key topic in the beginning of an anchor and **</A>** tominates diat andu^ 

presentation pages. and so defines the area where the intervening text is dis- 

Horizontal hyperiinks: FIQ. 6 shows horizontal hyper- played to be a hot area that should be highlighted s that a 

links 9S, Horizontal hyperlinks 95 are hypciiinkB from a mouse dick odier input device activates die hyperiink. 

key-topic entry in one summary page 62 to instances of the 65 The phrase NAME=:"DEF34S76** defines the name of die 

same key-topic entry in other summary pages 62, typically destination anchor 76 widiin the document in order that a 

from a bri^ key-topic entiy sudi as a key-phrase entry, to a hyperlink to the destinati n can find the destination. The 
phrase HREF="#GEN03789" provides the destination location specification 73 used as a destination reference name which the web browser will hyperlink to if the highlighted area is clicked on or otherwise activated. Here "DEF34576" and "GEN03789" are arbitrary generated names, but could just as well be "CAT" or "DOG". Where a hyperlink is to another document, the name of that document precedes the "#", so for example the hyperlink:

HREF="http://www.myserver.com/user1/project2#GEN03789"

would hyperlink to the label name GEN03789 in the document user1/project2 at the server computer at the network address http://www.myserver.com.

The just-described method is only one embodiment of hyperlinks. In another embodiment, for example, at the beginning or end of each document there could be a table of hyperlink numbers or names along with the location within the document where the source anchor associated with the table entry is located.

Entry to the plurality of presentation pages 150 that summary page generator 40 generates is typically through a hyperlink from an entry page 78 which the database provider likely will define as their home page. The viewer uses a mouse or other pointing device to click on a highlighted hot area of source anchor 72 in entry page 78 that hyperlinks to one of the summary pages 62. One embodiment, shown in FIG. 8, provides four types of summary pages 62: an abstract summary page 140, a concept summary page 200, a key-phrase summary page 100 and a table-of-contents summary page 80. Each of these summary pages 62 contains a list of key-topic-entry hyperlink anchors. These hyperlink anchors may be used for vertical hyperlinks (such as vertical hyperlink 91 of FIG. 5) or horizontal hyperlinks (such as horizontal hyperlink 95 of FIG. 6) referred to earlier, and they may be circular or not.

Abstract summary page 140 comprises a list of abstracts (high semantic content sentences are treated as 'abstracts'; in one embodiment, abstracts whose semantic content exceeds the threshold value that the author selected will be listed in the abstract summary page 140 in the order in which they occurred in the text) automatically derived by summary page generator 40. Concept summary page 200 comprises a list of concepts (wherein 'concepts' are noun phrases or noun-verb phrases that contain high-semantic-weight words; in one embodiment, the above-mentioned list of abstracts is generated; each abstract is then examined to determine all key topics; for each determined key topic, a copy of the abstract is made and 'rotated' so the key topic appears first to make the 'concept' (thus several concepts can be generated from each abstract); the list of concepts is then alphabetized) automatically derived by summary page generator 40. Key-phrase summary page 100 comprises a list of key phrases (key phrases are phrases with a high semantic weight; the key phrase is rotated so that the first word will be the highest semantic weight noun, since this is the word a person normally looks up) automatically derived by summary page generator 40. Table-of-contents summary page 80 comprises a table of contents (generated from the heading tokens inserted into source document 20 by its author) automatically derived by summary page generator 40.

In one embodiment, both the input (e.g., source document 20) and output documents 64 for summary page generator 40 must be HTML documents, and so may contain hyperlinks to other documents on other Internet computers. In another embodiment, shown in FIG. 8, an HTML formatter 50 is used to convert a word-processor document 20 into an HTML source document 52. An important feature of this embodiment of summary page generator 40 is that the output documents 64 generated are compatible with standard HTML documents. Thus all of the tools and user experience on dealing with HTML documents can be used with both input and output documents for summary page generator 40, and an author can customize their documents for summary page generator 40 as they would any other HTML document.

The pages (i.e., output documents 64) output from summary page generator 40 will normally be viewed with a web browser which makes use of the HTML markups.

The AnchorPage™ program of Iconovex Corporation provides one exemplary embodiment of summary page generator 40. When one such embodiment is run, a screen similar to that shown in FIG. 7 is presented. This screen allows selection of which of the four summary pages are to be generated. Generation of table-of-contents summary page 80 is selected by clicking on option box 101; a key-phrase summary page by clicking on option box 103; a concept summary page 200 by clicking on option box 105; and an abstract summary page 140 by clicking on option box 107. A threshold level is also selected (if only by default) for each summary page selected. Spinner 111 specifies the number of levels (from 1 to 6) of heading tokens to use to generate the table-of-contents summary page 80; spinner 113 specifies the density of phrases to compile for the key-phrase summary page 100; spinner 115 specifies the density of concepts to generate for concept summary page 200; and spinner 117 specifies the density of abstracts to generate for the abstract summary page 140. The author may also make a selection of hyperlinks, and may select one of the following: circular hyperlinks for phrase references at radio button 119, concept references at radio button 120, or no circular hyperlinks at radio button 121. The author may also select horizontal hyperlinks by clicking on option box 123. The user may activate "A-to-Z pages" for phrase summary pages at option box 109 or for concept summary pages 200 at option box 126. The user may also select (if only by default) a segment size at spinner 124, and may select other options by clicking at custom button 128.

When an author uses summary page generator 40, they select (at block 130 in FIG. 7) one or more documents to process. In one embodiment, these documents will typically appear in a single directory but can be in several different directories on the author's computer. Documents from elsewhere on the Internet are first downloaded to the author's computer, and then processed by summary page generator 40. In another embodiment, documents can be selected from any place on the Internet for processing by summary page generator 40. So, for example, the author could theoretically select a set of documents, some of which came from a computer at Munich, Germany, others from a computer in Osaka, Japan, and others from the author's own computer. The resulting presentation pages 150 and summary pages 62 output by summary page generator 40 contain the embedded hyperlinks that make the location of various documents transparent to the viewer. All of the generated documents or pages or the segments thereof are transmitted over the network 400 to a viewer computer 411 as needed by the viewer.

The author can activate special lexicon dictionaries of medical, business, legal, geographical, or other fields when running summary page generator 40. A special lexicon dictionary 195 is a supplemental lexicon dictionary that contains words from the regular lexicon dictionary 195 but with different weights than those used by regular lexicon dictionary 195, and/or contains additional words with assigned weights, usually related to a special technical field. In one embodiment, lexicon dictionary 195 has English-language words. In other embodiments, other languages are provided, with weight values chosen specifically for those languages (i.e., the individual word weights provided for an English-language lexicon dictionary 195 are generally not applicable for the literal translations of those words into other languages, since the importance of particular words in determining important concepts will vary across languages).
The special weights for a special lexicon dictionary 195 override the weights normally provided by regular lexicon dictionary 195. These special weights in special lexicon dictionary 195 provide special selection criteria which result in selection of key topics of special significance to the field of that lexicon dictionary 195. These special lexicon dictionaries 195 are activated by selecting the custom button 128, and then clicking an option box that will appear with titles like "legal," "business," "medical," "geographical" or other term that refers to a specific lexicon dictionary. Geographical names are an example of a special lexicon dictionary that has nominal value for generating some key topic lists and a great deal of value for other key topic lists. In one embodiment, a separate lexicon dictionary of such terms is selected and loaded into memory by summary page generator 40, where that lexicon dictionary is combined with the other words in regular lexicon dictionary 195. As a result, summary page generator 40 selects proportionally many more phrases as key topics which include words from the selected lexicon dictionary.
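The override behavior described above can be sketched as a simple dictionary merge. This is only an illustration under stated assumptions: the `WordEntry` type, the `combine_lexicons` helper, and every word and weight shown are hypothetical and are not taken from the AnchorPage program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordEntry:
    syntactic_value: int   # arbitrary part-of-speech code (hypothetical values)
    semantic_weight: int   # importance of the word for indexing


def combine_lexicons(regular, special):
    """Return a combined lexicon in which special entries override regular ones."""
    combined = dict(regular)   # copy, so the regular lexicon is left untouched
    combined.update(special)   # special weights replace regular weights
    return combined


# Hypothetical sample data.
regular = {"court": WordEntry(15, 20), "petroleum": WordEntry(15, 55)}
legal = {"court": WordEntry(15, 50), "tort": WordEntry(15, 58)}

combined = combine_lexicons(regular, legal)
```

With the sample data, "court" takes the weight from the legal lexicon, "tort" is added as a new word, and "petroleum" keeps its regular weight.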

The author causes summary page generator 40 to actually generate the summary pages 62 and presentation pages 150 by clicking on RUN button 132, and exits summary page generator 40 by clicking on CLOSE button 134.

In one embodiment shown in FIG. 9A, lexicon editor 39 allows an author to create and edit lexicon dictionaries 195 which are appropriate to the author's needs. Lexicon editor 39 allows addition or deletion of words to/from dictionary 195, and allows changing the values of the weights for a given word, i.e., syntactic value 803 and semantic value 804. In one embodiment additional weighting value fields (similar to syntactic value 803 and semantic value 804) are provided for each word object 800 in order to handle particular language analysis situations. In one embodiment, summary page generator 40 includes lexicon editor 39.

The description of the present invention involves text. However, it should be realized that such text will likely occur in documents that include pictures, other graphics, charts, sound, video sequences and possibly other elements.

Summary page generator 40 generates summary pages 62 which contain key topic lists. A key topic list is a list of terms and associated hyperlinks which are used to hyperlink into a presentation page 150 derived from a source document 20. Four exemplary types of summary pages which are generated by one embodiment of summary page generator 40 are:

Table-of-Contents Headings

These key topics are identified by HTML heading tokens, which an author inserts into a document to provide information indicating one of six levels of headings. The headings are bracketed by the heading tokens <h1> ... </h1> through <h6> ... </h6>. In other embodiments, other tokens or formats are used to indicate such headings. The heading text is copied and assembled into a table-of-contents summary page 80. Table-of-contents summary page 80 is a page that contains a key topic list of headings from source documents 20 and hyperlinks into presentation pages 150 derived from where the headers appear in those source documents 20. One embodiment of summary page generator 40 allows the author to select up to six levels of headings at spinner 111 (i.e., if the author selects level 3, then heading levels 1, 2 and 3 are included in table-of-contents summary page 80). In another embodiment, summary page generator 40 selects the heading levels to be used. FIG. 8 shows the process of how summary page generator 40 compiles these headings, derived from source document 20, into key topic entries in table-of-contents summary page 80 with hyperlinks to the locations in the presentation page 150 (also derived from source document 20) that contains these headings. The hyperlinks generated by this process are shown in FIG. 3.
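As a rough illustration of the heading-collection step, the following sketch gathers headings up to a chosen level with Python's standard `html.parser` module and emits one source-anchor line per heading. The class and function names, the `TOC00000`-style anchor names, and the indentation scheme are assumptions made for this example; a full implementation would also insert the matching destination anchors into the presentation page.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for headings h1..h<max_level>."""

    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level
        self.headings = []
        self._level = None    # level of the heading currently open, if any
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <H1> and <h1> both arrive as "h1".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if level <= self.max_level:
                self._level = level
                self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def build_toc(html, max_level=3):
    """Return table-of-contents lines with hypothetical generated anchor names."""
    collector = HeadingCollector(max_level)
    collector.feed(html)
    lines = []
    for i, (level, text) in enumerate(collector.headings):
        indent = "  " * (level - 1)
        lines.append(f'{indent}<A HREF="#TOC{i:05d}">{text}</A>')
    return lines


sample = "<h1>Intro</h1><p>x</p><h2>Scope</h2><h4>Deep</h4>"
toc = build_toc(sample, max_level=3)
```

Here the level-4 heading is skipped because the selected level is 3, mirroring the spinner behavior described above.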
Abstracts

The document is scanned to find high semantic content sentences and certain syntactical formations within sentences. These are treated as abstracts. In one embodiment, abstracts whose semantic content exceeds the threshold value that the author selected will be listed in the abstract summary page 140 in the order in which they occurred in the text. An abstract summary page 140 is a page that contains abstracts that were automatically generated by summary page generator 40 from the source documents 20. In one embodiment, each abstract has exactly one source, and abstract entries are hyperlinked to the place they appear in presentation page 150. In one embodiment, if horizontal hyperlinks are activated, entries in key-phrase summary page 100 can hyperlink into abstracts in abstract summary page 140.
Concepts

In one embodiment, all the abstracts listed in the abstract summary page 140 are scanned to identify concepts. In another embodiment, all the abstracts, whether or not listed in the abstract summary page 140, are scanned to identify concepts. 'Concepts' are the already-identified abstracts, but with the key phrases pulled out to act as 'headwords' (i.e., words placed at the beginning of a phrase in order to facilitate ordering, such as by alphabetizing). These key-phrase headwords are noun phrases or noun-verb phrases that contain high semantic weight words. While abstracts are each listed in abstract summary page 140 only once, concepts can have more than one significant, indexable phrase, and thus each concept is listed once for each separate key phrase. In one embodiment, significant concepts are assembled into a key-topic list and are inserted into concept summary page 200, where they are listed alphabetically according to the key-phrase headwords.

A concept summary page 200 is a page whose key-topic entries are key concepts and associated hyperlinks that were automatically generated from source documents. In identifying concepts, the document is automatically processed by summary page generator 40 to identify particularly high semantic weight key words. If the author includes medical or legal lexicon dictionaries, terms which are particularly significant to those fields are predominately selected.

Key phrases

Key phrases are phrases with a high semantic weight. The key phrase is rotated so that the first word will be the highest semantic weight noun or adjective: the word a person normally looks up. Key phrases are then assembled in alphabetical order into a key-phrase summary page 100 and hyperlinked to the places they occur in a presentation page 150. A key-phrase summary page 100 is a page that contains a list of hyperlinked key phrases that were automatically generated from source documents.

In one embodiment, as source document 20 is parsed to locate key topics, the text (and other multimedia data, if any) is copied into presentation pages 150, and destination anchors are inserted into the text corresponding to each key topic placed into the key topic lists for summary pages 62. In this embodiment, if horizontal hyperlinks are enabled, destination anchors are also inserted into the appropriate summary pages 62 to enable later finding the locations of the identified key topics. Summary page generator 40 next scans summary page templates 154, one for each summary page 62 to be generated, to obtain information about how to lay out the corresponding summary page 62 (i.e., which icons to use, where to place the icons, etc.). In one embodiment, the author can modify the summary page templates 154. In one embodiment, a summary page template 154 is also used to provide a template for presentation pages 150.

Summary page generator 40 creates key topic entries with hyperlink source anchors in the appropriate summary pages 62 for the key topics that summary page generator 40 finds in the selected documents 20. Summary page generator 40 also creates a hyperlink from each key-topic entry in a summary page to an instance of that key topic in the presentation pages 150 by filling in a destination location specification 73 with the name of a destination anchor 74. If circular hyperlinks are being generated, then summary page generator 40 creates hyperlinks to the next instance of that key topic in the text for each key-topic instance.

One embodiment of the present invention allows defining of exclusion zones. Exclusion zones are sections of text that are passed over without attempting to recognize certain key topics or without inserting embedded anchors. Exclusion zones are used to avoid creating embedded hyperlinks to tables, quoted text or any text through which the author does not want to be hyperlinked. Exclusion zones, in one embodiment for example, include any text surrounded by the HTML preformatted text tokens <PRE>...</PRE>, which are standard HTML. In one such embodiment, some other non-standard tokens that are HTML compatible, such as "<EXZ>...</EXZ>", are used to define exclusion zones.

Linguistic analyzer 42 is a computer program that does a linguistic analysis of the source documents in order to extract key topics. The Syntactica Engine™ is a linguistic analyzer used by one embodiment of the present invention. The operation of a linguistic analyzer is described below.

First, linguistic analyzer 42 scans the document and looks each word up in regular lexicon dictionary 195 and in any specialized lexicon dictionaries 195 that have been activated. For the purposes of the present invention, a lexicon dictionary is defined as a table of words from a natural language, such as English, each word of which has associated one or more semantic values or other information that can be accessed by computer programs. Linguistic analyzer 42 identifies and removes suffixes (such as "-ance," "-ability," "-ly" and "-ing") so that the stem form can be used. Thus the words "open," "opening," "openness" and "opens" would all be treated as the same stem "open." In one embodiment a different semantic weight is assigned based on the removed suffix; for example, more weight is assigned to "communications" than to "communication." In one embodiment, suffixes are removed before being looked up in regular lexicon dictionary 195. In another embodiment the word is looked up in regular lexicon dictionary 195, whose entries specify both the stem and the suffix.

The linguistic analyzer translates the input document into a data structure, such as the Intelligent Paragraph Format that is shown in FIG. 9A. The Intelligent Paragraph Format (IPF) consists of word objects and paragraph objects. Each word object 800 contains a stem entry stem 801 (e.g., "open"), a suffix 802 (e.g., "-ing"), a syntactic value 803 (e.g., "15" for a singular noun), and a semantic weight 804. Each word object is generally obtained from a regular lexicon dictionary 195. One embodiment of regular lexicon dictionary 195 contains over 110,000 word objects 800, each word object 800 having its own syntactic value 803, semantic weight 804, and suffix 802.

In one embodiment the syntactic value 803 of a word is an arbitrary code. In one embodiment, for example, adjectives are assigned the value of 1, adverbs 128, singular nouns 15, plural nouns 37. Some words can be different parts of speech and are assigned a value accordingly. For example, in "Time flies like an arrow, but fruit flies like bananas," the first word "flies" is a verb while the second word "flies" is a plural noun; the second word "like" is a verb, while the first is not. Similarly, "invalid" or "adolescent" can be both nouns or adjectives (when they are used as nouns, they describe a human agency) and so are assigned a value of 88 indicating they can act as both a noun and an adjective.

In one embodiment, the semantic weight of a word is a number between 1 and 63 which indicates the importance of the word's stem as an indexable quantity. A high value is assigned to specific words like "physiology" or "petroleum", and lower values to words like "one" or "number" which are less specific and have less value for indexing. Words like "bedding" or "bedridden" have a very low semantic weight. The semantic weights assigned to words are largely a subjective determination based on experience.

In addition to the word objects, there are also paragraph objects in the IPF. One possible embodiment of the paragraph objects is shown in list 880 in FIG. 9B. List 880 is a list pointing to the word objects for the words "THE" word 810, "QUICK" word 812 and "FOX" word 814. The construct of "sentence" does not exist in the IPF; a period (.) is treated as a word just like any other word whose syntactic and semantic weights are to be found in regular lexicon dictionary 195. However, a linguistic analyzer can extract sentences from the IPF formats. Paragraph objects determine the boundaries to the semantic analysis.

Referring to FIG. 8, HTML Formatter 50 first preprocesses a source document 20 into HTML document 52. Summary page generator 40 transforms the HTML document 52 through the HTML filter 41, which strips HTML encoding from the documents, leaving only the textual content. Next, the Syntactica Application Program Interface (API) used in this embodiment of linguistic analyzer 42 constructs the IPF objects for the words and paragraphs of the source document using one or more lexicon dictionaries 195. Next the API scans the IPF objects of the source document and creates IPF index objects 43 that will be used for the table-of-contents summary page 80, IPF key phrase objects 44 that will be used by the key-phrase summary page 100, IPF concept objects 45 that will be used by the concept summary page 200 and IPF abstract objects 46 that will be used by the abstracts summary page 140. The API also constructs an entry page.

The database format is the previously described Intelligent Paragraph Format (IPF), where index entries are stored as single paragraph objects. Referring to FIG. 7, the author will have selected which of these object views to generate by selecting the buttons 101, 103, 105 and 107 to determine which of the summary pages 100, 80, 140 and 200 shown in FIG. 8 will be generated. The Syntactica API also generates a list of unknown words 56. Unknown words are words that the Syntactica API finds in the source documents but that do not appear in regular lexicon dictionary 195.
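The word-object lookup, suffix removal and unknown-word list described above might be sketched as follows. This is a minimal sketch, not the Syntactica Engine: the suffix list, the `WordObject` fields' Python names, the tuple layout of the lexicon, and the sample weights are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical suffix list; longer suffixes are tried before shorter ones.
SUFFIXES = ("ability", "ness", "ance", "ing", "ly", "s")


@dataclass
class WordObject:
    stem: str
    suffix: str
    syntactic_value: int
    semantic_weight: int


def split_suffix(word, lexicon):
    """Return (stem, suffix); the suffix is empty if no known stem is found."""
    if word in lexicon:
        return word, ""
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in lexicon:
            return stem, suffix
    return word, ""


def analyze(words, lexicon):
    """Build word objects for known words and collect the unknown ones."""
    objects, unknown = [], []
    for raw in words:
        stem, suffix = split_suffix(raw.lower(), lexicon)
        if stem not in lexicon:
            unknown.append(raw)   # candidate for the unknown-words list
            continue
        syntactic_value, semantic_weight = lexicon[stem]
        objects.append(WordObject(stem, suffix, syntactic_value, semantic_weight))
    return objects, unknown


# Hypothetical lexicon: stem -> (syntactic value, semantic weight).
lexicon = {"open": (15, 20)}
objects, unknown = analyze(["Opening", "opens", "openness", "zyzzyva"], lexicon)
```

In this sample, the first three words all resolve to the stem "open" with different suffixes, and "zyzzyva" lands in the unknown list that an author could later add to the lexicon.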

The linguistic analysis ignores suffixes and words which have very low semantic weights. Thus, index entries in a circular index could be semantically similar but lexically different. One embodiment keeps synonym information in the dictionary entries in regular lexicon dictionary 195 so that synonyms can be identified and various references to a set of synonyms could be found in a single dictionary entry. Thus, if the words "dog" and "canine" were entered as synonyms, summary page generator 40 could treat "dog" and "canine" as being targets of the same index entry. Thus a viewer looking up "dog" could also find a hyperlink to references to "canine". A related embodiment generates, under the index entry "dog", a cross-reference of the form see "canine" that is hyperlinked to the index entry for "canine".
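One way to picture synonyms sharing a single index entry is to map each word to a canonical entry before grouping its locations. The sketch below is only an assumption about how this could be done; the function name, the mapping format and the anchor names are hypothetical.

```python
from collections import defaultdict


def build_synonym_index(occurrences, synonyms):
    """Group occurrence locations under one canonical index entry per synonym set.

    occurrences: list of (word, location) pairs
    synonyms:    dict mapping a word to its canonical index entry
    """
    index = defaultdict(list)
    for word, location in occurrences:
        canonical = synonyms.get(word, word)   # words without synonyms map to themselves
        index[canonical].append(location)
    return dict(index)


# Hypothetical occurrences with made-up destination anchor names.
occurrences = [("dog", "#GEN00001"), ("canine", "#GEN00002"), ("cat", "#GEN00003")]
index = build_synonym_index(occurrences, {"canine": "dog"})
```

A viewer following the "dog" entry would then reach both the "dog" and the "canine" references.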

In order to identify key topics in a source document 20 for constructing the summary pages 62, the IPF documents are scanned to identify nouns whose semantic weight exceeds a user-controllable threshold value. Adjacent words are then scanned to see if they fit into a syntactic pattern for a noun phrase (or other linguistic phrase of interest). Thus adjectives preceding the noun, and prepositional phrases after a noun, are identified as part of the noun phrase. The phrase can then be given a semantic weight according to a formula based on the semantic weight and syntactic function of the words in the phrase. While one embodiment of summary page generator 40 recognizes noun phrases, other embodiments also recognize verb phrases, prepositional phrases or verb noun phrases. Verb phrases are generally useful in generating abstracts, but are not generally as useful for generating concepts or key phrases.
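The noun-phrase scan can be illustrated with a deliberately simplified sketch. It uses the syntactic codes quoted in the description (adjective 1, singular noun 15, plural noun 37) but makes several assumptions: only preceding adjectives are attached (prepositional phrases are not handled), the phrase weight is taken to be the maximum word weight because the actual formula is not given, and the codes 0 and 99 in the sample are made-up placeholders for other parts of speech.

```python
ADJECTIVE, SINGULAR_NOUN, PLURAL_NOUN = 1, 15, 37
NOUNS = {SINGULAR_NOUN, PLURAL_NOUN}


def find_noun_phrases(tagged, threshold):
    """Return (phrase, weight) pairs for nouns whose weight exceeds the threshold.

    tagged: list of (word, syntactic_value, semantic_weight) triples
    """
    phrases = []
    for i, (_, syntactic_value, weight) in enumerate(tagged):
        if syntactic_value not in NOUNS or weight <= threshold:
            continue
        start = i
        # Walk left over any adjectives that directly precede the noun.
        while start > 0 and tagged[start - 1][1] == ADJECTIVE:
            start -= 1
        span = tagged[start : i + 1]
        text = " ".join(word for word, _, _ in span)
        phrase_weight = max(wt for _, _, wt in span)   # assumed formula
        phrases.append((text, phrase_weight))
    return phrases


# Hypothetical tagged words.
tagged = [("the", 0, 1), ("big", 1, 5), ("black", 1, 8),
          ("dog", 15, 40), ("ate", 99, 10), ("one", 15, 3)]
phrases = find_noun_phrases(tagged, threshold=30)
```

With a threshold of 30 only "dog" qualifies as a head noun, and its two preceding adjectives are pulled into the phrase.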

The author is able to control various parameters to affect the key-phrase entries selected. For example, special technical terms may appear in the text which are not found in regular lexicon dictionary 195. The author can either edit these words and their associated weights into regular lexicon dictionary 195 or create a new special lexicon dictionary containing them. Author customization, while not necessary, is useful, since customization enables an author to customize and run the key-phrase identification to their needs. The Syntactica Engine generates a list of unknown words 56: words which were found in the text but not in regular lexicon dictionary 195. The author can then add any or all of these words to regular lexicon dictionary 195, giving them semantic and syntactic values, and then reprocess their documents with the updated regular lexicon dictionary 195.

Returning to FIG. 8, the Syntactica Engine API (Application Program Interface) 42 generates the summary pages 62 from the IPF for the source document 20. The index IPF objects are:

table-of-contents index list 43

The author selects a table-of-contents summary page at box 101 in FIG. 7. The IPF paragraph objects for the source document 20 are scanned to find all the headings in a document. The number of heading levels output is specified by the spinner at 111, and is a number between 1 and 6. In one embodiment, the table-of-contents summary page 80 is the only summary page 62 that does not use the linguistic analysis of the Syntactica Engine used in linguistic analyzer 42. One embodiment of summary page generator 40 always generates all 6 heading levels rather than giving the user the ability to select the number of heading levels.

abstract index list 46

The author selects the abstract summary page 140 at box 107 in FIG. 7. Abstracts are listed in the order they appear within a document rather than in alphabetical order. To generate the abstracts summary page, the IPF objects are scanned to find all key topics whose value is greater than the abstract threshold value selected at spinner 117. Rarely will there be more than one hyperlink to an abstract, as abstract views abstract the meaning from a longer span of text, and thus are usually unique.

concept index lists 45

The author selects the concept summary page 200 at box 105. Typically the author will select a concept threshold value at spinner 115 that is higher than the one for abstract objects. To generate the concept summary pages 200, the IPF objects are scanned to find all concept key topics whose value exceeds the concept anchor threshold 115. The concept key topics that exceed the threshold are rotated so that the noun with the highest semantic weight becomes the first word. Thus a phrase like "big black dog" is rotated to "dog, big black". In another embodiment, the concept key topic is not rotated. In one embodiment, the concept key topics are alphabetically ordered for display on concept summary page 200.

key-phrase index list 44

The author selects the key-phrase summary page 100 at box 103 to generate the key-phrase summary pages 100. The IPF objects are scanned to find possible key phrases. Those phrases that pass the threshold value selected at spinner 113 are rotated so that the noun with the highest semantic weight becomes the first word. In one embodiment, these are alphabetically ordered for display on a key-phrase summary page 100.

The author, by setting various parameters, causes shorter or longer, or more or fewer key phrases to be selected as indexes. Shorter key phrases typically provide more-general key phrases and so provide more indexes per key phrase, on average. Longer key phrases typically provide more specific key phrases and therefore fewer indexes per key phrase, on average. Which method is best is application specific and therefore is typically left under control of the author.

There are some detail differences between the abstract,
concept, and phrase anchor view generation other than that
specified by the selection of different threshold levels.
Concept entries are derived from abstract entries. Key
phrase entries are also derived from abstract entries,
although in another embodiment, key phrase entries are
derived directly from source documents 20.

In one embodiment, source documents 20 are optionally
segmented while generating presentation pages 150 in order
to reduce the amount of text that must be downloaded from
the database provider's computer 413 to the viewer
computer 411 (see FIG. 2A). Thus the viewer can review one
summary page 62 or presentation page 150, a page at a time,
each of which requires only a limited bandwidth and can be
downloaded quickly. In one embodiment, source documents
20 comprise N data files. Summary page generator 40
converts these into M presentation pages, wherein M>N.
The M presentation pages include data from the original
source documents 20, plus hyperlink anchors. M is
determined by the density parameter provided by the author who
runs summary page generator 40.
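
The conversion of a long document into several smaller, linked pages can be sketched as below. The sketch assumes that the segment size is a paragraph count (one embodiment uses a default of 15 paragraphs) and that each segment carries plain "Previous Segment" and "Next Segment" anchors. The function names and the "SEG" base filename are hypothetical and chosen only for illustration.

```python
def segment_paragraphs(paragraphs, segment_size=15):
    """Split a list of paragraphs into consecutive segments."""
    if segment_size < 1:
        raise ValueError("segment_size must be at least 1")
    return [
        paragraphs[i:i + segment_size]
        for i in range(0, len(paragraphs), segment_size)
    ]


def render_segments(paragraphs, base="SEG", ext=".HTM", segment_size=15):
    """Return {filename: html}, one entry per segment.

    Every segment except the first gets a 'Previous Segment' anchor at
    the top; every segment except the last gets a 'Next Segment' anchor
    at the bottom.
    """
    segments = segment_paragraphs(paragraphs, segment_size)
    pages = {}
    for n, seg in enumerate(segments, start=1):
        parts = []
        if n > 1:
            parts.append(f'<A HREF="{base}{n - 1}{ext}">Previous Segment</A>')
        parts.extend(f"<P>{p}</P>" for p in seg)
        if n < len(segments):
            parts.append(f'<A HREF="{base}{n + 1}{ext}">Next Segment</A>')
        pages[f"{base}{n}{ext}"] = "\n".join(parts)
    return pages
```

A source file of 40 paragraphs becomes three pages of 15, 15, and 10 paragraphs, so only one of them has to be downloaded to display any given hyperlink target.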

At start-up time, the author specifies the segment size for
pages by selecting a value with spinner [...] of the start-up
screen. In one embodiment, the segment size is measured by
the number of paragraphs, a paragraph being a string of
text ending with a carriage return. The default setting is 15
paragraphs. When summary page generator 40 creates summary
pages 62 and presentation pages 150 at step 47 in FIG.
8, summary page generator 40 divides the pages (i.e., output
documents 64) into segments and inserts, at the end of each
segment except the last, an anchor containing a button and
<NEXTSEG>. If the viewer clicks on the button, the next segment
is downloaded to the viewer's computer and displayed.
Similarly, an anchor containing a button and <PREVSEG>
are inserted at the beginning of each segment except the first.
When the viewer clicks on one of these buttons, the previous
segment is downloaded into the local computer and displayed
on the viewer's computer screen. In one
embodiment, each segment is displayed independently, i.e.,
parts of two segments are not displayed on different parts of
the same display screen at the same time.

In one embodiment the web browser concurrently [...]
multiple windows containing summary pages 62 and/or
presentation pages [...] modifies the [...]
browser to combine the adjoining parts of adjacent
segments for simultaneous and thus seamless display. On
encountering [...], the
web browser automatically downloads the next or previous
segment into the viewer's computer and displays them both
seamlessly. This seamless display has two advantages. First,
the seamless display eliminates the next segment and previous
segment buttons and the associated viewer action.
Second, the seamless display allows displaying parts of
adjacent segments on the screen at the same time, thus
hiding the segmentation from the viewer.

Once these paragraph objects for the summary pages have
been created, an HTML document is created as shown in
blocks 47 and 48. At block 47, summary page generator 40
generates summary pages by first loading a summary page
template 154, usually a different one for each of the summary
pages, and appends to that summary page template 154 the
index list generated from their IFF objects for the specific
summary page. Summary page template 154 provides the
default boilerplate for a summary page 62. Next, hyperlinks
insertion 48 in the HTML source document 52 is performed
in order to generate the summary pages and the presentation
pages 150. If horizontal hyperlinks option box 123 is
selected, summary page generator 40 embeds a button
indicating a horizontal hyperlink and a hyperlink to the first
occurrence of the index in another summary page for
each index entry in the summary pages which has a
horizontal hyperlink.

Each of the summary pages 62 may comprise several
segments, only one of which may be loaded in the viewer's
computer and be viewable at one time. This is suggested in
FIG. 8, where the key-phrase summary page 100 comprises
segments 361, 362 and 363. Similarly presentation page 150
may comprise many segments from different documents.

A summary page template 154 is an HTML document
which has formatting information and boilerplate text.

In one embodiment hyperlink insertion begins with pages
which are in the IFF format. Outputting these pages from the
IFF format to a summary page 62 and presentation pages
150 on a computer screen is a simple process. An HTML
document is generated by scanning the IFF, one word at a
time, and reconstructing and outputting the words. As each
word is output to the presentation pages 150 (and summary
pages 62 if horizontal hyperlinks are selected), the concept,
key-phrase and abstract key-topic lists are scanned to
identify places in the text where these key topics occur, so an
anchor can be inserted.

In one embodiment shown in FIG. 12, summary page
generator 40 generates one or more secondary summary pages
250, one (or more) for each of the 26 letters in the alphabet.
Secondary summary pages 250 are summary pages 62 which are
inserted between one summary page 62 and the presentation
pages 150. Secondary summary pages 250, or A-to-Z pages, are
activated by selecting the box 109 or 126 shown in FIG. 7. If
activated, twenty-six secondary summary pages 250, one for
each letter, will be generated as is shown in FIG. 12. In one
such embodiment, if a secondary summary page is filled, an
additional secondary summary page is generated and
hyperlinked to the full one. The concept summary page 200 lists
[...] each of these letters is hyperlinked [...]
all concepts (or phrases) beginning [...]
secondary summary page 250. A-to-Z summary pages allow
the viewer or author to more quickly find the entry they are
looking for.

In an alternate embodiment, assuming there are N index
entries to be hyperlinked to which form a summary page and
E is the number of entries that would fit in the summary page
(E corresponds to the 26 entries for A-to-Z pages), then
every (N/E)th index entry could be put on a summary page. If
the viewer clicked on an entry in the summary page, the web
browser would hyperlink to a secondary summary page
where the hyperlinked-to entry would be the first item, and
the entries that follow would be those entries between the
selected phrase and the following one on the summary page
62.

A summary page template 154 can be used to define
macro sequences which are expanded when referenced later
in a document. For example, one could define the name
PHRAS_AZ to represent the anchor on the summary page
that would hyperlink to the Phrases A-to-Z page. PHRAS_AZ
is defined as:

<!--PHRAS_AZ><A HREF=PHRASE_AZ.TPL><IMG SRC="BUTTON.GIF"></A><!--/>




The above string defines PHRAS_AZ as a macro. String
"<!--" starts the macro definition (note that the two hyphens
represent non-breaking spaces). "<!--" is followed
by the name of the macro, "PHRAS_AZ", and a
closing ">". The definition of the macro is "<A HREF=
PHRASE_AZ.TPL> <IMG SRC="BUTTON.GIF"> </A>"
and the macro is terminated with: "<!--/>". "PHRAS_AZ"
will be replaced by "<A HREF=PHRASE_AZ.TPL>
<IMG SRC="BUTTON.GIF"> </A>" every place it appears
later in the document.

Here, the effect is to create both a name of the hyperlink
(HREF=PHRASE_AZ.TPL) and a button, whose image
comes from the file "BUTTON.GIF", that will be displayed.
Thus, the author can insert a single statement
<!--PHRASE_AZ--/> in an HTML document output by summary
page generator 40 to specify both the hyperlink to the
A-to-Z summary page, and the button that activates it.
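
The macro mechanism can be sketched as a simple text substitution. This is a hedged illustration only: the definition syntax matched here (an opening "<!--NAME>", the definition, and a closing "<!--/>") is an assumption reconstructed from the description above, and the function names are hypothetical. A macro is expanded only where its name appears after its definition.

```python
import re

# Assumed definition syntax: <!--NAME>definition<!--/>
MACRO_DEF = re.compile(r"<!--(\w+)>(.*?)<!--/>", re.DOTALL)


def _substitute(chunk, macros):
    """Replace every whole-word occurrence of a known macro name."""
    for name, body in macros.items():
        # A function replacement avoids backslash processing in `body`.
        chunk = re.sub(rf"\b{re.escape(name)}\b", lambda _m, b=body: b, chunk)
    return chunk


def expand_macros(text):
    """Strip macro definitions and expand later references to them."""
    macros = {}
    out = []
    pos = 0
    for m in MACRO_DEF.finditer(text):
        # Text before this definition sees only earlier macros.
        out.append(_substitute(text[pos:m.start()], macros))
        macros[m.group(1)] = m.group(2)
        pos = m.end()
    out.append(_substitute(text[pos:], macros))
    return "".join(out)
```

Keeping the definition in one place means that changing the button image or the link target requires editing only the macro, not every page that references it.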
Summary Page Templates 154 and Tokens

Summary page templates 154 are HTML documents that
define the layout features which are displayed from summary
pages 62. With summary page templates 154, the
author can indicate what types of hyperlinks the author
would like to provide on each summary page 62 or
presentation page 150 (to return from the presentation pages 150 to
the entry page 60, for instance, or to return from any of the
output pages in output documents 64 (see FIG. 8) to the
author's server's home page), what icons or text the author
would like to have serve as the anchor hot areas for those
hyperlinks, and any other design features or text the author
wants to appear on a given type of output page. In one
embodiment, default summary page templates 154 provide
hyperlinks from each of the summary page templates 154 for
summary pages 62 back to the entry page 75, hyperlinks to
retrieve the previous segment and next segment of the
current view, and added design features like horizontal rules
to separate the icons from the data on the screen page. All
of these design elements can be changed by simply editing
the summary page templates 154.

In one embodiment, an installation program will create
template subdirectories for each of the sets of summary page
templates 154 used by summary page generator 40. The
default summary page templates 154, for instance, are
located in the DEFAULT subdirectory. During installation,
icon files will be copied into the directory containing the
summary page templates 154 with which they are
associated. When the author runs summary page generator 40, the
necessary icon files are copied from the template directory
into the directory the author designates as the destination
directory for the project in order that the image-source
references will not require path names. This allows the
author's projects to be portable as long as the author keeps
all the project files together.

In one embodiment, each of the sets of summary page
templates 154 contains eight templates. Template functions
and default names are listed in Table 1 below.

TABLE 1

Template Names    Functions

ANCHOR.HTM        Defines the entry page 60;
TOC.HTM           Defines the table of contents page 80;
PHRASEAZ.HTM      Defines the phrase A to Z page;
CONCPTAZ.HTM      Defines the concept A to Z page;
PHRASE.HTM        Defines the key phrase page 100;
CONCEPT.HTM       Defines the concept page 200;
ABSTRACT.HTM      Defines the abstract page 140;
PRESENT.HTM       Defines the presentation page 150;
When editing summary page templates 154 or creating
new ones, the author can retain the default filenames, in
which case the author must store each template set in a
separate directory, or the author can choose new template
names. If the author chooses new template names, the author
can use any legal DOS filenames, but the author must
remember to indicate those new names in the initialization
(e.g., AP.INI) file in order to make those summary page
templates 154 active (see Table 2, below, for more
information on one embodiment of the AP.INI file used with one
embodiment of summary-page generator 40). Any legal
DOS extension can also be used for templates, but if the
author wishes to use a browser to view summary page
templates 154 as they are being worked on, an ".HTM"
extension should be used since many browsers will fail to
read any other extension as an HTML document.

TABLE 2

AP.INI File

The following elements can be set through the AP.INI file:

NETSCAPE = {0|1}
    The NetScape browser has difficulty reading
    the escape sequence for a non-breaking space
    (&nbsp;). For full compatibility with the
    NetScape browser, set this line to 1, which
    will convert all non-breaking spaces in the
    templates and in the source data to the ASCII
    numerical escape sequence (&#160;). The
    NetScape browser seems to handle this just
    fine. Default setting is 0.

SKIP_DOCUMENT = {0|1}
    If set to 1, then summary-page generator 40
    will simply skip any document in which it
    finds the HTML to be too faulty to read. If
    set to 0, then summary-page generator 40
    will display an error message and terminate
    [...] under this condition. Default
    setting is 1.

HTML_WARNINGS =
    All warnings are currently sent to disk
    rather than display.

HTML_WARNINGS_[...] =
    The name of the file to which warnings are
    sent. The default name is HTML.WAR.

HTML_FILE_EXT =
    Sets the extension for all output files. The
    default is .HTM.

PP_SEGMENT =
    The base filename for Presentation segments.
    Each segment of the Presentation View will
    have a filename which begins with this base
    name, is followed by an integer, and ends with
    the extension declared in the
    HTML_FILE_EXT variable, as in
    SPP23.HTM. The base name must be
    surrounded by quotes and should be kept, on
    long documents especially, fairly short. The
    base name is also case sensitive. This will
    not matter on DOS, but if you are planning
    on moving the output files to a server with a
    case-sensitive operating system you should
    make the case of your base names the same
    as the case the filenames will have on that
    operating system. The Link Case option on
    the interface will override this setting.
    Default base name is "SPP".

PAP_SEGMENT =
    The base filename for Phrase segments. See
    note under PP_SEGMENT. Default base
    name is "SPAP."

CAP_SEGMENT =
    The base filename for Concept segments.
    See note under PP_SEGMENT. Default
    base name is "SCAP."

AAP_SEGMENT =
    The base filename for Abstract segments.
    See note under PP_SEGMENT. Default
    base name is "SAAP."

TOC_SEGMENT =
    The base filename for Table of Contents
    segments. See note under PP_SEGMENT.
    Default base name is "STOC."

PHRASE_AZ_PAGE =
    The filename for the Phrase A to Z page.
    This page is not segmented, so as many as 8
    characters can be used. The extension
    indicated in the HTML_FILE_EXT
    variable will be used here as well. The
    quotation marks are necessary. The default
    filename is "PAPAZ."

CONCEPT_AZ_PAGE =
    The filename for the Concept A to Z page.
    See note under PHRASE_AZ_PAGE.
    The default filename is "CAPAZ."

[...] =
    The filename for entry page 60. See note
    under PHRASE_AZ_PAGE.
    The default filename is "HOME."
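
Two of the AP.INI settings lend themselves to a short sketch: how a segment filename is built from a base name, an integer, and the output extension, and what the NETSCAPE option does to non-breaking spaces. The function names below are hypothetical, and the eight-character check is an assumption drawn from the DOS filename limit mentioned in the table rather than documented behavior of summary-page generator 40.

```python
def segment_filename(base, number, ext=".HTM"):
    """Build a segment filename such as SPP23.HTM.

    Assumes DOS 8.3 naming: base name plus segment number must fit in
    eight characters, which is why short base names are recommended.
    """
    stem = f"{base}{number}"
    if len(stem) > 8:
        raise ValueError(f"{stem!r} exceeds the 8-character DOS limit")
    return f"{stem}{ext}"


def apply_netscape_fix(html, netscape):
    """With NETSCAPE = 1, rewrite &nbsp; as the numeric escape &#160;."""
    return html.replace("&nbsp;", "&#160;") if netscape else html
```

For example, `segment_filename("SPP", 23)` gives "SPP23.HTM", the example filename used in the PP_SEGMENT entry.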



One embodiment of the present invention also provides
other alternative template sets for the author to choose from
(e.g., Fancy, plain, etc.).

Template Sets and Icons: Viewing the Alternatives

One embodiment also provides a template master,
TEMPLATE.HTM, that the author can use to view the
various template sets used by summary page generator 40
and the icons associated with those template sets. To view
template sets, the author's browser is directed at
TEMPLATE.HTM. The template sets are listed as primary menu
items. Under each template set, the author will find an
anchor for "Icons" and an anchor for "Templates." Choosing
"Icons" will jump the author to a page in which all the icons
associated with the template set are laid out on a single page
for comparison. In some cases these icon sets will include
extra icons that are not in use in the template set as it is
currently laid out. These extra icons are provided so that the
author can substitute them for any of the icons in the set
without losing continuity of appearance.

Choosing "Templates" will jump the author to the summary
page template 154 for the [...]
template set. The author can choose any of the source
anchors 75 on the entry page 60 to travel to other templates
in the set and the author should be able to navigate the entire
template set in the same way that the end user will be able
to navigate the final output documents 64 produced by the
templates.

In this embodiment the layout of these summary page
templates 154 will accurately represent the layout of the
pages in output documents 64 they will produce with three
exceptions:
Data: Where data will be filled in at run time, the templates
will have a simple data token and a comment indicating
that this location is reserved for data [...].
Run-time tokens: Certain tokens cannot be assigned a
physical location on the output page until after the data
has been filled in. For example, the horizontal link icon
will generally come at the end of a line of data, which will
only be placed on the page at run time. The "location" of
these tokens on the template is irrelevant since their
eventual location on the output pages is hard-coded in the
software; they appear on the templates only in order to
define the icons that will be used. On the templates used
in one embodiment these tokens appear near the bottom
of the template.
Return to Master: At the bottom of each template is a
double horizontal rule and an anchor marked "Return to
Template Master." These elements appear only on the
templates and will not be transferred to the output pages.

Template Sets: Changing the Active Templates

In one embodiment, when the author runs summary page
generator 40, [...]
specified as the Template Directory [...]
box for summary page templates 154 with the filenames
listed in the AP.INI file. Changing the active template set is
a simple matter of specifying a new directory.

Designing Output Pages with Templates

One difficulty in HTML, especially for new users, is
visualizing how the HTML document being edited will
appear when seen through a browser on the web. This
difficulty is easily overcome by using a browser as an
integral part of the editing process. Before beginning to edit
a summary page template 154, the author points the browser
at that template to open it. The browser is left open, and the
computer is switched to an editor, and the desired changes
to the template are made. At any point during the editing
process, the author can save the document being worked on,
switch back to the browser, click on Reload, and
immediately see the results of the editing.

In one embodiment each of the summary page templates
154 contains instructions for the design and layout elements
of the page it describes. The template ANCHOR.HTM, for
instance, determines the layout of the entry page 60;
PRESENT.HTM determines the layout of the presentation page
150. A typical layout instruction on the PRESENT.HTM
might look like this:

[...]

Most of this will seem familiar to HTML users. What will
be unfamiliar here is the use of HTML comment codes
(<!-- -->) to create special tokens, in this case the "Return to
Anchor page (i.e., entry page 60)" token <!--RETAP-->.
Such tokens are necessary in one embodiment because of
the special dilemma posed by templates. When creating
most HTML pages, the author is working with static
[...] placement of anchors precisely
where the author wishes to have them appear. When working
with summary page templates 154, the author is creating an
"empty" form which will be filled in at run time with
variable text (the output documents 64 produced by
summary page generator 40), and the author must set anchors in
a position that is relative to this variable text. This requires
that the author indicate for summary page generator 40
which type of output goes on which page and which anchor
should be associated with it. All this is accomplished by
means of tokens. The next section will discuss these tokens
and then a fuller discussion of editing templates is
provided.

Tokens

Tokens are placeholders for data that cannot be filled in
until run time. In some cases this data will be the actual
output of summary page generator 40; in some cases it will
be the filenames of files that are created at run time. Tokens
are placed within HTML comment codes so that they will
not be visible when a browser is used to examine the
templates. This allows the templates to accurately reflect
how the final output pages of output documents 64 will look.
One embodiment uses four primary categories of tokens:

Link destination tokens

These tokens determine the destination point of a
hyperlink. For instance, in any project large enough to be broken
into segments, the various segments (i.e., presentation pages
150) of the Presentation View will have a Next Segment and
[...] determine precisely where on the next segment or previous
segment the hyperlink will lead the user. Typically, these
tokens will be placed at the top of the template so that the
hyperlink jump will lead the user to the top of the output
page.

Link destination tokens are:

TOPLINK   Marks the anchor destination for a direct
          hyperlink [...]. If the source anchor is an entry
          to a "Table of Contents," then clicking on it will
          cause a jump to wherever this destination token
          has been placed on the table of contents [...].
NEXTLINK  On segmented pages, this marks the destination
          of the jump caused by clicking on "Next Segment."
PREVLINK  On segmented pages, this marks the destination
          of the jump caused by clicking on "Previous
          Segment."

Link source tokens

These tokens mark the source anchor of a hyperlink,
where it will be placed on the page and what form it will
take, whether it will be an icon or a text string. This
information could be placed on the template in non-token
form, writing a regular HTML href to create a hyperlink from
the key phrase pages 100, for instance, to the entry page 60.
But if, in the next suite of documents, the template names
were changed, all the handwritten hrefs would be invalid and
would have to be edited manually. Using a token to create
this hyperlink (the token specifies "create a hyperlink
between the key phrase pages 100 and the entry page 60,
whatever they are called in this suite"), saves that extra
work.



Link source tokens are:

NEXT    Defines the "Next Segment" source anchor.
PREV    Defines the "Previous Segment" source anchor.
RETAP   Defines the "Return to Anchor page (i.e., entry page 60)"
        source anchor.
PAP     Defines the "Phrase View" source anchor. Appears only on
        the Anchor page (i.e., entry page 60) template.
PAPAZ   Defines the "Phrase A to Z" source anchor. Appears only
        on the Anchor page (i.e., entry page 60) template.
CAP     Defines the "Concept View" source anchor. Appears only
        on the Anchor page (i.e., entry page 60) template.
CAPAZ   Defines the "Concept A to Z" source anchor. Appears only
        on the Anchor page (i.e., entry page 60) template.
AAP     Defines the "Abstract View" source anchor. Appears only
        on the Anchor page (i.e., entry page 60) template.
TOC     Defines the "Table of Contents View" source anchor.
        Appears only on the Anchor page (i.e., entry page 60)
        template.
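
The benefit of link source tokens, namely that the hyperlink target is filled in at run time rather than handwritten, can be sketched as below. The sketch assumes paired "on" and "off" comment tags of the form <!--NAME--> and <!--/NAME--> with double hyphens, and a `filenames` mapping from token name to output filename. Both the mapping and the function name are hypothetical, and dropping tokens that have no entry in the mapping is an assumption made for the example.

```python
import re

# Paired token: <!--NAME--> ... <!--/NAME-->
TOKEN = re.compile(
    r"<!--(?P<name>[A-Za-z]+)-->(?P<body>.*?)<!--/(?P=name)-->", re.DOTALL
)
# First HREF attribute inside the token body (quoted or unquoted value).
HREF = re.compile(r'HREF\s*=\s*"?[^">\s]+"?', re.IGNORECASE)


def resolve_source_tokens(template, filenames):
    """Replace each token with its body, pointing the placeholder HREF
    at the real output filename for that token."""
    def repl(match):
        target = filenames.get(match.group("name").upper())
        if target is None:
            # Assumption: a token with no meaning on this page is dropped.
            return ""
        return HREF.sub(lambda _m: f'HREF="{target}"', match.group("body"), count=1)

    return TOKEN.sub(repl, template)
```

If the entry page is renamed in the next suite of documents, only the mapping changes; the template text stays the same.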



Run-time tokens

These are tokens which can be defined on the template,
but whose precise position on the page cannot be set until
run time because they are data-dependent. The horizontal
hyperlink icon, for instance, will come at the end of a phrase
or concept entry, and this entry is not created until run time.
As a matter of convenience, run-time tokens are typically
placed at the bottom of the template, but in fact their exact
position is irrelevant. Run-time tokens are:

PHRASECIRC   Defines a phrase circular hyperlink.
CONCEPTCIRC  Defines a concept circular hyperlink.
HORZ         Defines a horizontal hyperlink.
AAPICON      Defines the icon that will be placed at the
             beginning of an abstract entry as its anchor.

Data tokens

These are placeholders for the output data that will be
produced at run time. The data tokens are:

DATA   Placeholder for output data.
CITE   Placeholder for the filename cited at the bottom of
       output pages.

The tokens and their functions for one embodiment of the
present invention are listed in Table 3 below. With the
exception of the data tokens, these are all paired set tags; that
is, a token beginning is marked with an "on" tag such as
<!--horz--> and the end is marked with an "off" tag such as
<!--/horz-->. Everything between these two tags is
considered part of the token declaration. In addition to the token
name, token location is also part of this token declaration.
The data token, when placed on the Abstract View template,
is read by the template parser as standing for "abstract view
data"; on the Presentation View template it is read as
"Presentation View data." A token placed on a template
where it would have no meaning (e.g., a data token placed
on the template for entry page 60 of FIG. 8) is ignored.

TABLE 3

Token Names and Functions

Name         Type         How it functions

PHRASECIRC   Run-time     Defines a phrase circular hyperlink.
CONCEPTCIRC  Run-time     Defines a concept circular hyperlink.
HORZ         Run-time     Defines a horizontal hyperlink.
AAPICON      Run-time     Defines the icon that will be placed
                          at the beginning of an abstract entry
                          as its anchor.
DATA         Data         Placeholder for output data.
CITE         Data         Placeholder for the filename cited at the
                          bottom of output pages (this is an
                          optional token that can be omitted.
                          It is provided as a convenience for
                          administrators so that if they see an
                          error they can more easily find its
                          source).
TOPLINK      Destination  Marks the anchor destination for a
                          direct hyperlink to one of the Anchor
                          pages (i.e., entry pages 60). If the
                          source anchor on the Anchor page (i.e.,
                          entry page 60) is "Table of
                          Contents," clicking on it will cause
                          a jump to wherever this destination
                          token has been placed on the table-of-
                          contents [...].
NEXTLINK     Destination  On segmented pages, this marks the
                          destination of the jump caused by
                          clicking on "Next Segment."
PREVLINK     Destination  On segmented pages, this marks the
                          destination of the jump caused by
                          clicking on "Previous Segment."
NEXT         Source       Defines the "Next Segment" source
                          anchor.
PREV         Source       Defines the "Previous Segment" source
                          anchor.
RETAP        Source       Defines the "Return to Anchor Page
                          (i.e., entry page 60)" source anchor.
PAP          Source       Defines the "Phrase View" source
                          anchor.
PAPAZ        Source       Defines the "Phrase A to Z" source
                          anchor (typically the "Phrase View"
                          source anchor is the same as the
                          "Phrase A to Z" source anchor.
                          Only one of these is used on any
                          given project, depending on the
                          options you have selected. The same is
                          true for the "Concept View" and
                          the "Concept A to Z" anchors).
CAP          Source       Defines the "Concept View" source
                          anchor.
CAPAZ        Source       Defines the "Concept A to Z" source
                          anchor.
AAP          Source       Defines the "Abstract View" source
                          anchor.
TOC          Source       Defines the "Table of Contents View"
                          source anchor.
APCOMMENT    Comment      This allows you to place a comment in a
                          template without having that comment
                          transferred to the output pages.


Token Format

All tokens other than data tokens and comment tokens use
the same format, as follows:

<!--TOKENNAME-->X<A HREF="FILENAME.HTM">Y</A>Z<!--/TOKENNAME-->

The token begins with the token name placed within an
HTML comment (<!--TOKENNAME-->) so that it will not
be visible when the template is viewed through a browser.
The template will, therefore, be very close in appearance to
the output page it is intended to produce. This gives direct
visual feedback when editing the templates. The token ends
with a token off comment (<!--/TOKENNAME-->).
Everything between these two comments will be considered a part
of the token definition.

The "X" after the first comment marks the "pre" area of
the token. Any legal HTML can be placed here, including
plain text; it will be inserted in the output page immediately
before the anchor defined for this token. This is useful if, for
instance, an icon is being defined as an anchor and it is
wished to have it preceded by a horizontal rule, or perhaps
by a paragraph mark so that it always begins a new line.

The next section of the token marks the beginning of the
anchor and the hypertext reference for the destination of that
anchor. This is standard HTML, though it is used slightly
differently here. The "FILENAME.HTM" is actually a
placeholder in terms of token function. At run time, the
template parser will replace this "dummy" filename with a
valid output filename. All the hrefs in the templates could, in
fact, be set to HREF="LOCAL" or HREF=FAKENAME and
it would have no effect on the functionality of summary page
generator 40. On the templates used in this embodiment, however,
[...] is gained out of this placeholder: If in the
above example the token were, in fact, <!--RETAP-->
(the token for returning to the entry page 60), the
destination in the href is made into the Anchor template (see
the examples in the section "Editing Templates"). This will
allow the author, when viewing the template currently being
edited through a browser, to click on the "Return to Anchor
page (i.e., entry page 60)" icon and be taken to the Anchor
template. In this way, one can navigate through the template
set just as one will be able to navigate through the final
output set.

The "Y" after the destination anchor is the "mid" area of
the token. Whatever is placed here will become the source
anchor for the jump. Most often this will be an icon or some
sort of explanatory text. The </A> that follows the mid area
closes the anchor.

The "Z" marks the "post" area of the token. As in the
"pre" area, any legal HTML can be placed here, including
plain body text. This is helpful if there is some design
element one wishes to have always associated with the
source anchor, if, for instance, one wishes to have it
followed by a line break or a horizontal rule. Most of these
design elements can also be added to the template itself,
outside the token, but there are occasions, especially with
run-time tokens, when it is most convenient to have the
design element a part of the token. The AAPICON token, for
instance, is the token which defines the icon to be used as the
anchor for entries on the Abstract View. One may decide that
the appearance of the Abstract View is best when there are
two spaces inserted between the icon and the first word in
the entry. In this case, one could simply add two
non-breaking spaces (special characters which the word
processor uses to specify that the word before and the word after
the non-breaking spaces should both be placed on the same
line, rather than breaking at the end of a line between them)
[...]

[...] (even as comments) to the output page. Everything between
the comment "on" and comment "off" codes is considered
part of the comment by summary page generator 40. The
entire expression will be ignored by the template parser at
run time.

Data tokens, because they act only as placeholders, have
an even simpler form:

<!--DATA-->

or

<!--CITE-->

At run time, the data token will be replaced by the
appropriate HTML output data for that page. The citation token,
if chosen, will be replaced by the filename of the output
document 64.

Tokens for the Alphabet

Tokens for the alphabetical characters on the Phrase A to
Z and the Concept A to Z page take the same [...]
form as other source tokens, except in the case of
alphabetical characters two anchors are
defined for each character. When processing small documents,
[...]
when there are no entries for some letters of the alphabet
[...]
Concept A to Z page might take the form,

<!--CAPLET=H--><A HREF=[...]>H</A>(H)[...]

The opening comment <!--CAPLET=H--> indicates which
letter is being defined. As with other source anchors, the href
statement is actually a placeholder that will be replaced at
run time. In the above example, this href points to the
Concept A to Z template as a matter of convenience for
editing the templates. The "mid" area indicates the anchor to
be used when entries for this letter exist. In this case, a
simple capital H is used. Any icon or text character
combination can be designated as the anchor. The "post" area
indicates the anchor to be used when no entries for the letter
exist. Here, a capital H in parentheses is being used. Once
again, any icon or character combination can be designated
[...].

Tokens for letters on the Phrase A to Z are identical except
that they begin <!--PAPLET=H-->.

Editing Templates

The following example explains editing templates. In this
section, several basic templates are shown and how they can
be edited to incorporate various design elements is
described.

Sample Template: The Entry Page 60

into post area of the token d^a^ ^ In one embodiment a template for a basic entry page 61 

Comment tokens are also pam^d set ^kens, bm ftey ^^^^^ ^t a mhihiS the following: 

function much more dimply and thus have a mudi wspLa wmm* ' ^ 

form: • 



<t-APOOMMENr->. . xonnDCit . . . <I-/APCOMBfflNr-> 



1 ^4nha>^smASt><6ik>AMxibpi^titfi<f^^ 

65 <mEADxBaDY> 

This token is provided so that one can place comments in 2 <i-topunk-> 

summaiy page tcnplates 154 and not have fiiem transfcned 
3 <!--PAPAZ--><A HREF="PHRASEAZ.HTM">Phrases</A><P><!--/PAPAZ-->
4 <!--PAP--><A HREF="PHRASE.HTM">Phrases</A><P><!--/PAP-->
5 <!--CAPAZ--><A HREF="CONCPTAZ.HTM">Concepts</A><P><!--/CAPAZ-->
6 <!--CAP--><A HREF="CONCEPT.HTM">Concepts</A><P><!--/CAP-->
7 <!--TOC--><A HREF="TOC.HTM">Table of Contents</A><P><!--/TOC-->
8 <!--AAP--><A HREF="ABSTRACT.HTM">Abstract</A><!--/AAP--><P>
9 <A href="http://www.iconovex.com/HOME[...]"><IMG SRC="ICONHOME.GIF" ALIGN=MIDDLE ALT=RETURN></A><STRONG>Iconovex Home Page</STRONG>
10 </BODY></HTML>
Line 1 inserts standard HTML tags to indicate the beginning of an HTML document, beginning and end of the head, beginning and end of the title, and the beginning of the body text. These tags are necessary not only so that the template itself will be in proper HTML form (some browsers are rather finicky and will not read HTML properly if some of these tags are missing), but also so that the final output entry page 60 will be in proper HTML form.

Line 2 is the "toplink" token. Whenever a user clicks on a "Return to Anchor page (i.e., entry page 60)" token on one of the other output pages, they will be jumped to the location of this token. Most often, as in this case, this token will be placed at the very top of the template.

Line 3 is the "Phrase A to Z page" token and line 4 is the "Phrase View" token. Only one of these tokens (or neither) will be read by the parser at run time, depending on the options selected: if a Phrase A to Z page is being generated, the "Phrase A to Z" token will be read; if Phrase View is being generated, but not a Phrase A to Z page, then the "Phrase View" token will be read. If no Phrase View is being generated, both of these tokens will be ignored.
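
The selection rule just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the template parser of the described embodiment: the function name select_phrase_token and its two boolean options are hypothetical, and the token syntax <!--NAME--> ... <!--/NAME--> is assumed from the sample template.

```python
import re


def select_phrase_token(template, phrase_view, phrase_az):
    """Keep at most one of the PAPAZ / PAP token bodies (hypothetical helper)."""
    if not phrase_view:
        keep = None          # no Phrase View: both tokens are ignored
    elif phrase_az:
        keep = "PAPAZ"       # a Phrase A to Z page is being generated
    else:
        keep = "PAP"         # Phrase View only

    result = template
    for name in ("PAPAZ", "PAP"):
        pattern = re.compile(
            r"<!--{0}-->(.*?)<!--/{0}-->".format(name), re.DOTALL
        )
        if name == keep:
            # Drop the token markers but keep the pre/mid/post content.
            result = pattern.sub(lambda m: m.group(1), result)
        else:
            # Drop the whole token, including anything in its "post" area.
            result = pattern.sub("", result)
    return result
```

Because the paragraph mark sits inside each token, removing the unused token also removes its paragraph mark, which is the behavior the template relies on.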

The hrefs that follow the token declarations are, as discussed earlier, simple placeholders that will be replaced at run time with the appropriate filenames. In the meantime, however, the name of the appropriate template file is used as the source anchor. This is what allows the templates to be hyperlinked in the same fashion as the eventual output pages they will create.
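
As a rough illustration of the placeholder replacement described above, the following Python sketch swaps the href of a token body for an output filename. The helper name fill_href and the output filename in the usage note are hypothetical; the source does not say how the parser performs the substitution.

```python
import re

# Matches the first <A HREF="..."> in a token body (case-insensitive).
HREF_RE = re.compile(r'(<A\s+HREF=)"[^"]*"', re.IGNORECASE)


def fill_href(token_body, output_filename):
    """Replace the placeholder href with the real output filename."""
    return HREF_RE.sub(
        lambda m: '{0}"{1}"'.format(m.group(1), output_filename),
        token_body,
        count=1,
    )
```

For example, fill_href('<A HREF="PHRASE.TPL">Phrases</A><P>', "DOC_PHR.HTM") leaves the anchor text and the paragraph mark untouched and changes only the destination.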

Following the source anchor is the single word "Phrases" that will become the hot word on the output page. In the "post" area, after closing the anchor, is a paragraph mark which ensures that the next token will begin on a new line. Note that the paragraph mark might also have been added to the template outside of the token. One could rewrite the lines to read:

<!--PAPAZ--><A HREF="PHRASEAZ.HTM">Phrases</A><!--/PAPAZ--><P>
<!--PAP--><A HREF="PHRASE.HTM">Phrases</A><!--/PAP--><P>

or

<!--PAPAZ--><A HREF="PHRASEAZ.HTM">Phrases</A><!--/PAPAZ-->
<!--PAP--><A HREF="PHRASE.HTM">Phrases</A><!--/PAP--><P>

However, since one of these tokens is ignored at run time, this will leave an extra paragraph mark which can, under some circumstances and with some browsers, produce an extra, unintended space. If the paragraph marks are placed within the token tags, so that they become part of the token definition, one of them will always be ignored at run time.

Lines 5 and 6 are the "Concept A to Z" token and the "Concept View" token respectively. These lines parallel the structure and function of lines 3 and 4. Lines 7 and 8 are the "Table of Contents" token and "Abstract" token respectively.

Line 9 defines an icon (ICONHOME.GIF) as an anchor for a jump to the home page of the Iconovex Corporation server. Line 10 closes the body text and HTML. As with the opening lines, these lines are added to maintain proper HTML form on both the template itself and on the output pages it will produce.

One embodiment of this template includes definitions for icons and other added layout elements. The HTML for such a modified template looks like this:

1 <HTML><HEAD><title>Anchor Page</title></HEAD><BODY>
2 <!--TOPLINK-->
3 <H2><IMG SRC="ANCR.GIF" ALIGN=MIDDLE>The Anchor Page</H2>
4 <HR>
5 <strong>Please choose one of the following Synopsis Views of the [...]
6 <!--APCOMMENT--> Please note that only one of the following tokens for Phrases and one of the tokens for Concepts will be used at run-time, depending on whether or not one is creating an A-to-Z page. The same icon has been chosen to be used in either case, but different icons can be used to indicate whether an A to Z page exists, if desired. <!--/APCOMMENT-->
7 <!--PAPAZ--><A HREF="PHRASEAZ.TPL"><IMG SRC="LTHOUSE.GIF" align=middle alt="Phrases">Phrases</A>&#160;<!--/PAPAZ-->
8 <!--PAP--><A HREF="PHRASE.TPL"><IMG SRC="LTHOUSE.GIF" ALIGN=MIDDLE alt="Phrases">Phrases</A>&#160;&#160;<!--/PAP-->
9 <!--CAPAZ--><A HREF="CONCPTAZ.TPL"><IMG SRC="BULB.GIF" align=middle alt="Concepts">Concepts</A>&#160;&#160;<!--/CAPAZ-->
10 <!--CAP--><A HREF="CONCEPT.TPL"><img src="BULB.GIF" ALIGN=MIDDLE alt="Concepts">Concepts</A>&#160;&#160;<!--/CAP-->
11 <!--TOC--><A HREF="TOC.TPL"><IMG SRC="SHIP.GIF" align=middle alt="Table of Contents">Table of Contents</A>&#160;&#160;<!--/TOC-->
12 <!--AAP--><A HREF="ABSTRACT.TPL"><img src="[...].GIF" ALIGN=MIDDLE alt="Abstract">Abstract</A><!--/AAP-->
13 <P>
14 <hr>
15 <a href="http://www.iconovex.com/[...]"><IMG SRC="ICONHOME.GIF" ALIGN=MIDDLE ALT=RETURN></A><STRONG>Iconovex Home Page</strong>
16 <!--APCOMMENT--><hr><hr><a href="[...]"><img src="[...]" align=middle>Return to Template Master</a><!--/APCOMMENT-->
17 </BODY></HTML>



In line 3 a decorative icon and caption have been added, followed by a horizontal rule (line 4). In line 5 body text is added. In this case, it has been chosen to be brief, but explanatory notes can be added here about the document or documents being presented.

Line 6 constitutes a template comment. At run time the template parser will ignore everything between the "comment on" token (<!--APCOMMENT-->) and the "comment off" token (<!--/APCOMMENT-->). This token should be used whenever it is desired to place something on the template that is not wanted to be transferred to the output pages. In this case, the comment is used to add some explanatory text about the templates themselves.
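
The effect of the comment token can be sketched as a single substitution. The Python below is a minimal sketch under the assumption that comment tokens are never nested; the function name strip_template_comments is hypothetical and is not taken from the described embodiment.

```python
import re

# Everything from "comment on" to the nearest "comment off", across lines.
APCOMMENT_RE = re.compile(r"<!--APCOMMENT-->.*?<!--/APCOMMENT-->", re.DOTALL)


def strip_template_comments(template):
    """Remove template comments so they never reach the output pages."""
    return APCOMMENT_RE.sub("", template)
```

A browser viewing the template still sees whatever HTML sits between the two markers, since each marker is an ordinary HTML comment; only the parser drops the enclosed span.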

Lines 7 through 15 are essentially the same as in the previous example except that here icons have been added for the source anchors and those icons have been lined up horizontally across the page rather than vertically down it by removing the paragraph tags. The non-breaking spaces in the "post" area of each token definition are to space the icons more evenly across the page. At line 14 a horizontal rule has also been added to separate the primary content of the page from the navigational icon at the bottom.

Line 16 makes a slightly different use of a comment token. Here, two horizontal rules and a source anchor are included for the template master (TEMPLATE.HTM) within the token. The anchor to the template master is provided so that one can easily navigate between template sets when viewing them in a browser. It is enclosed in comment tokens so that it will not be transferred to the final output pages.
Sample Template: The Presentation Page

A very basic presentation page template might include the following:

1 <!--APCOMMENT--><HTML><HEAD><TITLE>[...]</TITLE></HEAD><BODY><!--/APCOMMENT-->
2 <!--NEXTLINK-->
3 <!--PREVLINK-->
4 <!--RETAP--><A HREF="ANCHOR.TPL"><IMG SRC="SMANCR.GIF"></A><!--/RETAP-->
5 <!--PREV--><A HREF="PRESENT.TPL"><IMG SRC="PREV.GIF"></A><!--/PREV-->
6 <p>
7 <!--DATA-->
8 <p>
9 <!--NEXT--><A HREF="PRESENT.TPL"><IMG SRC="NEXT.GIF"></A><!--/NEXT-->
10 <!--APCOMMENT--></BODY></HTML><!--/APCOMMENT-->

The first line in this example is peculiar to presentation page 150. At run time, summary page generator 40 will look at the source documents for the "Open Head" (<HEAD>) and "Close Head" (</HEAD>) tags, and will transfer all the information between these tags directly to the appropriate Presentation pages. In the process it will also write all the appropriate HTML tags to open and close the document and the body text. None of this information, therefore, needs to be included on the template (and will, in one embodiment, confuse summary page generator 40 if it is included). However, if these tags were not included on the template in some form, many browsers would not be able to properly view the template itself. The solution here is to provide the tags for the template within comment tokens. In this way, the browser can read the tags and thus read the template correctly, but the template parser will not transfer the tags to the output pages.

Line 2 is a destination anchor. If a user is viewing the previous segment of the Presentation View and clicks on the "Next Segment" anchor, he or she is jumped to this destination. On the first page of any given project (where there can be no "Next Segment" anchor preceding it) or on a project which is not broken into segments, this token is simply ignored. Normally this token is placed at the top of the template. Note that the "toplink" token is not used on the Presentation page since summary page generator 40 provides no source anchors for direct jumps to the top of this view of the document.

Line 3 is also a destination token, in this case the destination of a jump made when a user clicks on the "Previous Segment" anchor located on a following page. This token is ignored on the last segment of any project and on projects that are not broken into segments.

These first two tokens mark destinations on the output pages, but do not produce any visible indicators on those pages.

The token on line 4 will produce the first visible element on the output pages. This is the "Return to Anchor page (i.e., entry page 60)" token, and the icon for that anchor, SMANCR.GIF, will appear in the upper left corner of the output page. No explanatory text or caption is attached to the icon.

Line 5 is the "Previous Segment" token. In this case the icon PREV.GIF is placed immediately to the right of SMANCR.GIF and serves as the anchor for the jump to the previous segment.

The paragraph tag after the "Previous Segment" token adds a blank line and ensures that the output data, which will replace the data token in line 7, begins on a new line. The paragraph tag after the data token has the same effect at the end of the output data.

Line 9 is the "Next Segment" source token, and the icon NEXT.GIF, which is designated here as the anchor for that token, will be the last element on the presentation page. Line 10 is another comment token, in this case used to formally close the body text and HTML. These tags are placed inside a comment token for the same reasons as Line 1.

These ten lines of HTML on the template will produce a presentation page 150 of which the exact appearance is browser-dependent. This is a fully functional presentation page 150, with everything the user needs to navigate through the document, but it is not particularly distinguished in its aesthetic appeal. Below is an example of how the template might be edited to incorporate a few simple design elements into the final output page.

1 <!--APCOMMENT--><HTML><HEAD><TITLE>[...]</TITLE></HEAD><BODY><!--/APCOMMENT-->
2 <!--NEXTLINK-->
3 <!--PREVLINK-->
4 <!--RETAP--><A HREF="ANCHOR.TPL"><IMG SRC="SMANCR.GIF" ALIGN=MIDDLE></A><STRONG>Return to Anchor page</STRONG>&NBSP;<!--/RETAP-->
5 <!--PREV--><A HREF="PRESENT.TPL"><IMG SRC="PREV.GIF" ALIGN=MIDDLE></A><STRONG>Previous Page</STRONG><!--/PREV-->
6 <HR>
7 This is a full-text facsimile of the original document. This symbol, <A HREF="LOCAL">P</A>, indicates a circular hyperlink. Clicking on this hyperlink will take the viewer to the next occurrence of the same phrase.
8 <p>
9 <!--DATA-->
10 <p>
11 <HR>
12 <A HREF="HTTP://WWW.ICONOVEX.COM/HOMEPAGE.HTM"><IMG SRC="ICONHOME.GIF" ALIGN=MIDDLE></A><STRONG>Iconovex Home Page</STRONG>&NBSP;&NBSP;
13 <!--NEXT--><A HREF="PRESENT.TPL"><IMG SRC="NEXT.GIF" ALIGN=MIDDLE></A><STRONG>Next Segment</STRONG><!--/NEXT-->
14 <BR>
15 <!--CITE-->
16 <!--PHRASECIRC-->P<!--/PHRASECIRC-->
17 <!--CONCEPTCIRC-->C<!--/CONCEPTCIRC-->
18 <!--APCOMMENT--><hr><hr><a href="[...]"><img src="[...]" align=middle>Return to Template Master</a><!--/APCOMMENT-->
19 <!--APCOMMENT--></BODY></HTML><!--/APCOMMENT-->

This example begins by placing the comment token and destination anchors in the same location as in the earlier example. The "Return to Anchor page (i.e., entry page 60)" token is next, but here we've added a few elements to that token. The text "Return to Anchor page (i.e., entry page 60)" has been added as a bold caption, aligned with the middle of the icon. In the "post" area we've added a non-breaking space so that the next icon won't push up against the caption from this one.

Line 5 is the "Previous Segment" token with the same changes as were made to the "Return to Anchor page (i.e., entry page 60)" token except that a non-breaking space has not been added here, since this icon is the last element on the line.

In line 6 a horizontal rule is added to separate the navigational icons from the text on the page. Immediately below that horizontal rule some explanatory text has been placed that will appear at the top of each segment. (It's important to keep in mind that with hypertext, unlike linear text, the user may enter the document at any point by any number of routes and that any explanatory text or other notes that would normally be necessary at the beginning of a linear text should be placed at the beginning of each segment of hypertext.)

At line 8 a paragraph tag is added to separate the boilerplate text from the output data which will replace the data token at line 9. At lines 10 and 11 a paragraph tag and a horizontal rule are placed to separate the data from the navigational icons that will appear at the bottom of the segment.

In lines 12 through 13 anchors for returning to a server's home page and for retrieving the next segment of the document are placed. The form for these tokens is identical to the source tokens used at the top of the template. Once again non-breaking spaces are added between the icons to separate them some.

At line 14 a break tag is added to start a new line without adding blank space, and after that a citation token is placed so that the filename will appear in the lower left corner of the segment. This is the final visual element defined for the output pages.

Lines 16 and 17 are run-time tokens for phrase circular hyperlinks and concept circular hyperlinks respectively. These tokens are purely "definitional" in function; they describe what will be used for the anchor for their respective hyperlinks, but they do not affect the location of the anchor. These tokens can be placed anywhere in the template, and are grouped at the bottom in this case simply for convenience. Since only one type of circular hyperlink can be used in any given project, it would not be necessary to define both of them, but by doing so one can change options at run time and not have to edit the templates again.

Line 18 is identical to the "template master" line used on the Anchor page (i.e., entry page 60) template; it is here only as a navigational tool for the templates and is not transferred to the output pages. Line 19 closes the body text and HTML on the template.

Integration Within a Word Processor as a Tool

Conventional word processor programs (WPPs), such as WordPerfect or Word for Windows, often include an index-generation program, wherein a user manually inserts "index codes" at points within the text of a document where the user wants the index to point. An index code specifies information to the word-processor program (such as an index term or phrase, which is listed in an index, usually alphabetically, and a location which is used by the word-processor program to generate a page number or reference) in order that the WPP can generate an index. With conventional word processor programs, the user would have to determine which portions of the document [...], determine where the index points would be, and would have to manually enter the term [...]. A process running in the word processor program would then scan the document to determine the locations of each index code relative to page breaks, and thus generate an index (which is often appended to the end of the document) containing the terms entered by the user and a page-number cross-reference into the document.

Refer again to FIG. 3, which shows the flow from source document 20 through summary page generator 40 to resultant documents 64. As noted above, in its most general form, summary page generator 40 is a program running on a computer which automatically analyzes textual data in a source document 20, and using weighting rules determines from that textual data what are the most significant phrases (i.e., strings of words), and generates a presentation page 150 which contains textual data from source document 20 plus special codes embedded in that textual data, the codes specifying to another program (in this embodiment a word-processor program) where those significant phrases are.

In this embodiment the present invention is integrated as a built-in or add-on feature (sometimes called a "tool") into a conventional word processor program, such as WordPerfect or Word for Windows. Such a tool is usable from within the word processor without resort to switching to another program. Two examples of such tools are the spell-checker and the spreadsheet tools which are sometimes available within word-processor programs.

One such embodiment of the present invention is an indexing tool available within a word-processing program which, when activated by the user, automatically identifies key topics and phrases in a document's text and inserts identifying index codes for the index-generation program into the word-processor document for later use by the word processor program to generate an index to those key topics. In particular, one such embodiment automatically identifies semantically important key topics within an integrated word-processor environment, inserts index codes, and generates an index which contains cross-references from the index to each occurrence of a subject identified. In one such embodiment, the cross references are hyperlinks which are "activatable" when a viewer electronically views the document. The present invention thus scans a document from within a word-processor program, automatically identifies significant key topics in the document, and creates and inserts index codes for these key topics, alleviating the author from having to use mental steps to identify the key topics and manually enter the index code and the term to be used within the index entry.

One embodiment facilitates a process wherein the author later reviews and can edit the index codes inserted. This allows the author to fine-tune the resultant index by manually adding or removing index codes. The word processor program then processes the resultant document to generate the resultant index by conventional means. In one embodiment, the index contains hyperlinks to take a viewer of the computer-stored document from an index term in the index to the location in the document which caused the generation of that index term. In one embodiment, the final document is stored on a computer-readable medium, such as magnetic disk storage, CDROM, or a network such as the Internet.

One embodiment of the present invention includes a process running in a word-processor program on a computer which (a) allows an author to select index generation for a document being processed (edited) and then, using a semantic analyzer program running on a computer, (b) automatically identifies significant key topics within the document, and (c) generates and embeds index codes into the text of the document. The index codes are later used to generate an index which cross-references each term or phrase in the index to the location within the document which explains that term or phrase. In one embodiment the cross references are hyperlinks which are used by a document-viewing program (such as a web browser program, a word-processor program, or other browser program capable of showing a viewer the hot areas of the source anchors and hyperlinking to the destination anchors when the viewer clicks on a hot area).
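
Steps (c) and the later index generation can be illustrated with a small Python sketch. It assumes the key topics have already been identified by the semantic analyzer, which is not modeled here, and it uses a made-up index-code syntax, [[INDEX:term]], purely for illustration; real word processors use their own index-code formats. The function names embed_index_codes and build_index are hypothetical.

```python
import re


def embed_index_codes(text, key_topics):
    """Insert a hypothetical [[INDEX:term]] code before each key topic."""
    if not key_topics:
        return text
    # Longest topics first, so a longer phrase wins over a shorter one.
    ordered = sorted(key_topics, key=len, reverse=True)
    pattern = re.compile(
        "|".join(re.escape(topic) for topic in ordered), re.IGNORECASE
    )
    return pattern.sub(
        lambda m: "[[INDEX:{0}]]{1}".format(m.group(0).lower(), m.group(0)),
        text,
    )


def build_index(coded_text):
    """Map each index term to the character offsets of its index codes."""
    index = {}
    for match in re.finditer(r"\[\[INDEX:(.*?)\]\]", coded_text):
        index.setdefault(match.group(1), []).append(match.start())
    return index
```

An author reviewing the document could add or delete such codes by hand before the index is built, which corresponds to the fine-tuning step described earlier.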

In another embodiment, the cross reference is a numeric or alphabetic indication provided to the viewer, which enables the viewer to go to the referenced destination by using that indication, such as the viewer would using a conventional index.

In one embodiment, a single integrated computer program takes a conventional computer-stored document as input and provides the viewer with an indexed and/or hyperlinked view of a semantically-analyzed form of that document, wherein the viewer can view, scan forward and backward, search for text strings as in a conventional text-viewing program, as well as having semantically important key topics marked and indexed, and having hyperlinks between the index and key topics and/or between key topics as marked in the text of the document.

In one embodiment, the index contains abstracts, as explained above. In another embodiment, the index contains concepts, as explained above. In yet another embodiment, the index contains key topics, as explained above. In yet another embodiment, the index contains multiple parts, each part containing one or more of the just-listed types of index terms.

Another aspect of the present invention is a program, or a process within a program, which converts a document from a given conventional mark-up language, such as might be used or output from a conventional word processor program such as WordPerfect or Word for Windows, into HTML. HTML is the preferred language of the World-Wide Web; however, since HTML is only a half-dozen years old, there are relatively few documents in HTML form. Many major word-processor programs provide options for saving files in "Rich-Text Format" (RTF). RTF uses the standard ASCII character set to record the stylistic features of a document, in order that those features can be preserved with the document when the document is transported by a path which recognizes only the single ASCII character set (e.g., a path such as most e-mail), or when the document is being ported from one word processor to another. One such embodiment recognizes the following RTF style indicators and converts them into the appropriate HTML tokens:
Headings 1 through 6,
Body paragraphs,
Special characters (e.g., &, <, >, and non-breaking space),
Character emphasis styles (e.g., bold, italic, and underscore),
Bullet lists, and
Number lists.

In one embodiment, only headings marked with style tags are recognized and converted, and headings set off with manual styles are not handled.
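
The style-to-tag mapping can be sketched as a lookup table. The Python below is a simplified sketch and not an RTF reader: it assumes the document has already been split into (style name, text) pairs, the style names "Heading 1" through "Heading 6" and "Body" are illustrative assumptions, and lists and character emphasis are left out for brevity. The function name convert_paragraphs is hypothetical.

```python
from html import escape

# Heading styles 1 through 6 map to H1..H6; body paragraphs map to P.
STYLE_TO_TAG = {"heading {0}".format(n): "H{0}".format(n) for n in range(1, 7)}
STYLE_TO_TAG["body"] = "P"


def convert_paragraphs(paragraphs):
    """Convert (style, text) pairs to HTML, escaping &, < and >."""
    lines = []
    for style, text in paragraphs:
        # Unrecognized styles (e.g. manual formatting) fall back to P.
        tag = STYLE_TO_TAG.get(style.lower(), "P")
        lines.append("<{0}>{1}</{0}>".format(tag, escape(text, quote=False)))
    return "\n".join(lines)
```

Falling back to a plain paragraph for unknown styles mirrors the statement that headings set off with manual styles are not handled as headings.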

It is to be understood that the above description is intended to be illustrative, and not restrictive. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the invention should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
What is claimed is:
1. A method for creating a hyperlink to text in a computer-readable document comprising the steps of:
identifying a first key topic in said document, wherein said identifying said first key topic step includes the step of analyzing text in said document with a computer program;
inserting a first source anchor associated with said first key topic into a list; and
creating a hyperlink between said first source anchor and said first key topic in said document.
2. The method according to claim 1 further including the steps of:
identifying a second key topic in said document, wherein said step identifying said second key topic includes the step of analyzing text in said document with a computer program;
inserting a second source anchor associated with said second key topic into said list; and
creating a hyperlink between said second source anchor and said second key topic in said document.

3. The method according to claim 2 further including the steps of:
dividing said list into a plurality of sublists; and
outputting each one of said plurality of sublists into a respective separate segment, wherein each said separate segment is individually loadable into a computer.

4. The method according to claim 2 further including the steps of:
creating an entry page;
creating a summary page;
inserting said list into said summary page; and
creating a hyperlink from said entry page to said summary page.

5. The method according to claim 4 wherein said list comprises one or more of the following: a table of contents, a list of concepts, a list of abstracts and a list of phrases.

6. The method according to claim 2 further including the steps of:
creating a page;
selecting a summary page template;
inserting said selected summary page template in said page; and
inserting said list in said page.

7. The method according to claim 1 further including the steps of:
identifying a second key topic in said document, wherein said step identifying said second key topic includes the steps of:
semantically analyzing text in said document with a computer program, and
determining with a computer program that said second key topic is semantically similar to said first key topic; and
creating a hyperlink between said first source anchor and said second key topic in said document.

8. The method according to claim 1 further including the steps of:
identifying a second key topic in said document, wherein said step identifying said second key topic includes the steps of:
semantically analyzing text in said document with a computer program, and
determining with a computer program that said second key topic is semantically similar to said first key topic; and
creating a hyperlink between said first key topic and said second key topic in said document.

9. The method according to claim 1, wherein said step of analyzing text includes the steps of:
selecting a threshold weight for identifying key topics;
locating a candidate key topic;
calculating a weight for said candidate key topic; and
choosing said first key topic as a result of comparing said weight of said candidate key topic and said threshold weight.

10. The method according to claim 1 wherein said step of analyzing text includes the step of semantically analyzing text in said document with a computer program.

11. The method according to claim 10 wherein said step of semantically analyzing includes the step of using a lexicon dictionary in which semantic weights are assigned to words and wherein the values of said semantic weights can be edited.

12. The method according to claim 10 wherein said semantically analyzing step includes the step of recognizing of noun phrases with a computer program.

13. The method according to claim 1 wherein said step of identifying a first key topic includes the step of determining whether said first key topic is marked as a heading.

14. The method according to claim 1 further including the steps of:
marking as excluded a portion of said document; and
suppressing insertion of anchors for key topics within said excluded portion.

15. The method according to claim 1 wherein at least part of said document is downloaded over a network.

16. The method according to claim 1 wherein at least part of said document is stored on a CD-ROM.

17. A summary page generator for creating a hyperlink to text in a computer-readable document comprising:
means for identifying a first key topic in said document, wherein said identifying said first key topic step includes the step of analyzing text in said document with a computer program;
means for inserting a first source anchor associated with said first key topic into a list; and
means for creating a hyperlink between said first source anchor and said first key topic in said document.

18. A method for navigating to, and viewing, key topics in a viewer document using a first computer comprising the steps of:
displaying a list on a computer display, wherein said list comprises a listing of key topics which was generated by a computer program that scanned a source document and identified said key topics within said source document;
selecting a key topic from said list to be a selected key topic in response to input from a human; and
using a hyperlink associated with said selected key topic to locate and display said selected key topic in context in said viewer document, wherein said viewer document was generated from said source document.

19. The method according to claim 18 further including
the steps of:

loading a first segment of said viewer document from a
second computer to said first computer over a network,
wherein said first segment contains said list; and
displaying said first segment.

20. The method according to claim 19 further including
the steps of:

determining that an end of said first segment has been
displayed by said first computer;
downloading a second segment from said second computer
to said first computer; and
displaying said second segment.
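
The segment-by-segment loading of claims 19 and 20 might be sketched as follows. The class `SegmentViewer` and its `fetch` callback are hypothetical; the callback stands in for whatever network transfer brings a segment from the second computer to the first, and "displaying" is modelled simply as appending to a list.

```python
from typing import Callable, List

class SegmentViewer:
    """Displays a viewer document one segment at a time, fetching the next
    segment only when the end of the displayed content has been reached."""

    def __init__(self, fetch: Callable[[int], str], segment_count: int):
        self._fetch = fetch          # e.g. a function that downloads segment i
        self._count = segment_count
        self.displayed: List[str] = []

    def show_first(self) -> None:
        # Load and display the first segment (the one containing the list).
        if not self.displayed:
            self.displayed.append(self._fetch(0))

    def on_end_reached(self) -> bool:
        """Call when the end of the last displayed segment has been shown.
        Returns True if another segment was downloaded and displayed."""
        next_index = len(self.displayed)
        if next_index >= self._count:
            return False
        self.displayed.append(self._fetch(next_index))
        return True
```

Only the segments the reader actually reaches are transferred, which is the benefit the abstract attributes to segmenting.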

21. A computer document data structure comprising:

an entry page; and
a summary page, wherein said summary page includes a
list of index entries including a first and a second index
entry, wherein said first index entry includes a first
hyperlink to a first key topic appearing in a first
segment, said first segment comprising textual
information, and wherein said second index entry
includes a second hyperlink to a second key topic
appearing in a second segment, said second segment
comprising textual information, wherein said second
key topic is substantially similar to said first key topic,
wherein said summary page is located in a third segment,
and wherein said entry page includes a third hyperlink to
said summary page.

22. The computer document data structure of claim 21,
wherein said first, second and third hyperlinks specify on
which computer said first, second and third segments,
respectively, are located.
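
One way to picture the data structure of claims 21 and 22 is the sketch below. The class and field names, the host names, and the `http://` URL form are all assumptions chosen for illustration; the claims require only that each hyperlink identify the computer on which its target segment is located.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class Hyperlink:
    host: str          # computer on which the target segment is located
    segment: str       # name of the target segment
    anchor: str = ""   # destination anchor inside the segment, if any

    def url(self) -> str:
        base = f"http://{self.host}/{self.segment}"
        return f"{base}#{self.anchor}" if self.anchor else base

@dataclass
class IndexEntry:
    key_topic: str
    link: Hyperlink

@dataclass
class SummaryPage:
    segment: str                       # the segment holding the summary page
    entries: List[IndexEntry] = field(default_factory=list)

@dataclass
class EntryPage:
    summary_link: Hyperlink            # hyperlink from entry page to summary page

def example_document():
    """Build a two-entry example; the host names are made up."""
    summary = SummaryPage(
        segment="summary.html",
        entries=[
            IndexEntry("hyperlink", Hyperlink("host-a.example", "seg1.html", "kt1")),
            IndexEntry("hyperlinks", Hyperlink("host-b.example", "seg2.html", "kt2")),
        ],
    )
    entry = EntryPage(Hyperlink("host-c.example", summary.segment))
    return entry, summary
```

The two index entries carry substantially similar key topics that appear in different segments, and the three hyperlinks each name a different host.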

23. A tool for a word-processor program, the word-
processor program capable of processing a first computer-
readable document, the tool comprising:

means for identifying a first key topic in said first
document, wherein said means for identifying said first
key topic step includes means for analyzing text in said
first document with a computer program;
means for inserting a first source anchor associated with
said first key topic into a list; and
means for creating a hyperlink between said first source
anchor and said first key topic in said first document.

24. The tool according to claim 23, wherein the means for
analyzing text further comprises means for semantically
analyzing text in said first document with a computer
program.

25. The tool according to claim 23, wherein said list is an
index.

26. The tool according to claim 25, wherein said index is
appended to an end of said first document.

27. The tool according to claim 23, further comprising
means for generating a second computer-readable
document, wherein said second document comprises hypertext
markup language (HTML) tokens and at least some
textual information from said first document.
28. The tool according to claim 27, wherein said first
document comprises rich text format (RTF) tokens.

29. A page template data structure for a hypertext mark-up
language comprising:

a token, wherein the token is embedded within a comment
code, wherein the comment code is interpreted by the
hypertext markup language as a comment, and wherein
the token comprises a token definition which includes
a specification for a hypertext markup language
command.

30. The summary-page template data structure according
to claim 29, wherein the token definition comprises either a
destination hyperlink anchor or a source hyperlink anchor.

31. The summary-page template data structure according
to claim 29, wherein the token definition comprises a data
placeholder.
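
The template tokens of claims 29 to 31 can be illustrated with a small sketch. The comment syntax `<!-- TOKEN name: ... -->`, the `{placeholder}` notation, and the function `expand_template` are invented here for illustration and are not the token format of the specification. The point shown is only that a token hidden in an HTML comment is ignored by a browser yet can carry an HTML command specification with data placeholders for a generator to fill in.

```python
import html
import re

# A token lives inside an HTML comment: a name, a colon, then an HTML
# command specification that may contain {placeholder} fields.
TOKEN = re.compile(r"<!--\s*TOKEN\s+(\w+)\s*:\s*(.*?)\s*-->", re.DOTALL)

def expand_template(template, values):
    """Replace each recognised token comment with its HTML command, filling
    the data placeholders from values[name]. Ordinary comments and tokens
    without supplied values are left untouched."""
    def replace(match):
        name, spec = match.group(1), match.group(2)
        if name not in values:
            return match.group(0)
        escaped = {k: html.escape(str(v)) for k, v in values[name].items()}
        return spec.format(**escaped)
    return TOKEN.sub(replace, template)
```

For example, a token whose definition is a source hyperlink anchor expands into a working link, while an unprocessed template still renders as valid HTML because every token is just a comment.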

* * * * *
At Col. 14, line 54, please begin a new paragraph with --In identifying--.

At Col. 22, line 23, please delete " " and insert 

At Col. 22, line 37, please delete " "* and insert 